Skip to content

feat: security hardening -- sandbox, risk override, SSRF self-heal, DAST fix#1100

Merged
Aureliolo merged 17 commits intomainfrom
feat/security-hardening
Apr 7, 2026
Merged

feat: security hardening -- sandbox, risk override, SSRF self-heal, DAST fix#1100
Aureliolo merged 17 commits intomainfrom
feat/security-hardening

Conversation

@Aureliolo
Copy link
Copy Markdown
Owner

Summary

Security hardening batch implementing 4 issues: sandbox improvements, SecOps risk tier override, self-healing SSRF settings, and DAST report filename fix.

Changes

#1098 -- Fix DAST report artifact filenames

  • Update -J/-r flags to match ZAP's actual output naming (report_json.json, report_html.html)
  • Add if-no-files-found: warn for debugging

#696 -- Sandbox security improvements

  • Add runtime_overrides to DockerSandboxConfig for per-category container runtime selection
  • Create SandboxRuntimeResolver with Docker daemon probing and automatic runc fallback when gVisor unavailable
  • Inject default gVisor overrides for code_execution and terminal via merge_gvisor_defaults()
  • Create SandboxPolicy 4-domain model (filesystem, network, process, inference) as optional overlay
  • Create SandboxCredentialManager for env var credential stripping (14 patterns)
  • Create SandboxAuthProxy interface stub for future LLM traffic proxy

#222 -- SecOps risk tier reclassification override

  • Create RiskTierOverride frozen model with mandatory expiration, revocation, paired-nullability validation
  • Create SecOpsRiskClassifier wrapping RiskClassifier with runtime override support (last-added-wins)
  • Create RiskOverrideRepository protocol and SQLiteRiskOverrideRepository with shared write lock, proper error handling, rollback

#671 -- Self-healing SSRF settings

  • Create SsrfViolation model with pending/allowed/denied lifecycle and resolution tracking
  • Create SsrfViolationRepository protocol and SQLiteSsrfViolationRepository with shared write lock, PENDING transition guard
  • Create SecuritySubscriber watching providers/discovery_allowlist with JSON parsing and callback invocation
  • Add partial composite indexes for active-override and violation-by-status queries

Review coverage

Pre-reviewed by 7 specialized agents (security-reviewer, code-reviewer, type-design-analyzer, test-analyzer, silent-failure-hunter, docs-consistency, persistence-reviewer). 28 findings identified and addressed before PR creation.

Test plan

  • 14,702 unit tests pass (0 failures, 5 platform skips)
  • mypy strict: 0 errors across 1,621 source files
  • ruff: all checks pass
  • All pre-push hooks pass (mypy, pytest, ruff, gitleaks)

Follow-up items

  • docs/design/operations.md sections for 4-domain sandbox policy, SecOps override, and SSRF self-healing (design spec update)
  • SQLite repository dedicated tests (test_risk_override_repo.py, test_ssrf_violation_repo.py)
  • API controllers for risk overrides and SSRF violations (endpoints defined in issue scope but deferred to API layer PR)

Closes #1098
Closes #696
Closes #222
Closes #671

Copilot AI review requested due to automatic review settings April 6, 2026 23:02
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview April 6, 2026 23:03 — with GitHub Actions Inactive
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a robust security and sandboxing infrastructure, including a runtime risk reclassification system, SSRF violation tracking with self-healing allowlists, and enhanced sandbox isolation. Key components include a 4-domain sandbox policy model, a credential manager for environment variable sanitization, and a gVisor runtime resolver with automatic fallback. Persistence for these features is provided through new SQLite repositories. Feedback highlights two critical syntax errors in the exception handling blocks of the newly added repository implementations.

for row in rows:
try:
results.append(_row_to_override(row))
except ValueError, ValidationError:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The except clause uses invalid syntax for catching multiple exceptions in Python 3. It should use a tuple: except (ValueError, ValidationError):. As written, this will result in a SyntaxError when the module is loaded.

Suggested change
except ValueError, ValidationError:
except (ValueError, ValidationError):

for row in rows:
try:
results.append(_row_to_violation(row))
except ValueError, ValidationError:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The except clause uses invalid syntax for catching multiple exceptions in Python 3. It should use a tuple: except (ValueError, ValidationError):. As written, this will result in a SyntaxError when the module is loaded.

Suggested change
except ValueError, ValidationError:
except (ValueError, ValidationError):

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 6, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA 73d2013.
Ensure that dependencies are being submitted on PR branches. Re-running this action after a short time may resolve the issue. See the documentation for more information and troubleshooting advice.

Scanned Files

None

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 6, 2026

Codecov Report

❌ Patch coverage is 72.19917% with 134 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.00%. Comparing base (d3022c7) to head (73d2013).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
...synthorg/persistence/sqlite/ssrf_violation_repo.py 28.04% 59 Missing ⚠️
.../synthorg/persistence/sqlite/risk_override_repo.py 31.57% 52 Missing ⚠️
src/synthorg/security/rules/risk_override.py 90.62% 3 Missing and 3 partials ⚠️
...nthorg/settings/subscribers/security_subscriber.py 83.33% 5 Missing and 1 partial ⚠️
src/synthorg/tools/sandbox/docker_sandbox.py 73.33% 3 Missing and 1 partial ⚠️
src/synthorg/security/ssrf_violation.py 91.66% 2 Missing and 1 partial ⚠️
src/synthorg/persistence/sqlite/backend.py 85.71% 2 Missing ⚠️
src/synthorg/tools/sandbox/auth_proxy.py 90.47% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1100      +/-   ##
==========================================
- Coverage   89.18%   89.00%   -0.18%     
==========================================
  Files         791      802      +11     
  Lines       46515    46993     +478     
  Branches     4680     4715      +35     
==========================================
+ Hits        41484    41827     +343     
- Misses       4158     4285     +127     
- Partials      873      881       +8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

This comment was marked as resolved.

@Aureliolo
Copy link
Copy Markdown
Owner Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 7, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 7, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4b8cde41-f746-43f0-9b88-16b918f86607

📥 Commits

Reviewing files that changed from the base of the PR and between af3a140 and 73d2013.

📒 Files selected for processing (3)
  • src/synthorg/persistence/sqlite/schema.sql
  • tests/unit/persistence/sqlite/test_migrations.py
  • tests/unit/security/test_ssrf_violation.py
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Deploy Preview
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Web
  • GitHub Check: Build Backend
  • GitHub Check: Build Sandbox
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations -- Python 3.14 has PEP 649
Use except A, B: syntax without parentheses -- ruff enforces this on Python 3.14
Line length: 88 characters (ruff)

Files:

  • tests/unit/persistence/sqlite/test_migrations.py
  • tests/unit/security/test_ssrf_violation.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Async: asyncio_mode = 'auto' -- no manual @pytest.mark.asyncio needed
Timeout: 30 seconds per test (global in pyproject.toml -- do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed)
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases
Property-based testing with Hypothesis: use @given + @settings. Profiles in tests/conftest.py: ci (deterministic, max_examples=10 + derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline). Controlled via HYPOTHESIS_PROFILE env var
Failing Hypothesis examples are persisted to ~/.synthorg/hypothesis-examples/ (write-only shared DB) via _WriteOnlyDatabase in tests/conftest.py
Flaky tests: NEVER skip, dismiss, or ignore -- always fix them fully. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic. For blocking tasks, use asyncio.Event().wait() instead of asyncio.sleep(large_number)

Files:

  • tests/unit/persistence/sqlite/test_migrations.py
  • tests/unit/security/test_ssrf_violation.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/persistence/sqlite/test_migrations.py
  • tests/unit/security/test_ssrf_violation.py
🧠 Learnings (8)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T21:32:02.880Z
Learning: Applies to .github/workflows/docker.yml : Docker workflow: builds backend + web + sandbox images, pushes to GHCR, signs with cosign. SLSA L3 provenance attestations via actions/attest-build-provenance. Scans: Trivy (CRITICAL = hard fail, HIGH = warn) + Grype (critical cutoff) + CIS Docker Benchmark v1.6.0 compliance (informational). CVE triage via .github/.trivyignore.yaml and .github/.grype.yaml. Images only pushed after scans pass. Triggers on push to main and version tags (v*).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-06T16:35:12.934Z
Learning: Agents in the Parallel tier (Haiku-class) include 14 reviewers: async-concurrency-reviewer, comment-analyzer, design-token-audit, docs-consistency, go-conventions-enforcer, go-security-reviewer, infra-reviewer, issue-resolution-verifier, persistence-reviewer, pr-test-analyzer, security-reviewer, silent-failure-hunter, test-quality-reviewer, and type-design-analyzer
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T10:18:35.230Z
Learning: After the PR exists, use /aurelio-review-pr to handle external reviewer feedback
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to .github/zap-rules.tsv : ZAP DAST security scanning rules in `.github/zap-rules.tsv`
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to tests/**/*.py : Tests must use `test-provider`, `test-small-001`, etc. instead of real vendor names

Applied to files:

  • tests/unit/security/test_ssrf_violation.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to {src,tests,web,cli,site}/**/*.{py,ts,tsx,go,astro} : Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001. Vendor names may only appear in: (1) Operations design page provider list (docs/design/operations.md), (2) .claude/ skill/agent files, (3) third-party import paths/module names.

Applied to files:

  • tests/unit/security/test_ssrf_violation.py
📚 Learning: 2026-04-07T10:18:35.230Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T10:18:35.230Z
Learning: Applies to src/**/*.py : Vendor-agnostic everywhere: NEVER use real vendor names in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small. Vendor names only in: Operations design page, .claude/ files, third-party imports, provider presets (src/synthorg/providers/presets.py)

Applied to files:

  • tests/unit/security/test_ssrf_violation.py
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to {src,tests}/**/*.py : Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small` as aliases

Applied to files:

  • tests/unit/security/test_ssrf_violation.py
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to {src/synthorg/**/*.py,tests/**/*.py,web/src/**/*.{ts,tsx}} : NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small`

Applied to files:

  • tests/unit/security/test_ssrf_violation.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to tests/**/*.py : Tests must use test-provider, test-small-001, etc. for vendor-agnostic test data.

Applied to files:

  • tests/unit/security/test_ssrf_violation.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to web/**/*.vue : Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in dashboard code — use generic names: `example-provider`, `example-large-001`, etc.

Applied to files:

  • tests/unit/security/test_ssrf_violation.py
🔇 Additional comments (11)
src/synthorg/persistence/sqlite/schema.sql (3)

530-553: LGTM!

The risk_overrides table correctly implements:

  • Paired-nullability CHECK constraint for revoked_at/revoked_by (lines 542-546)
  • Optimized partial index idx_ro_active with (created_at DESC, expires_at) column order for the list_active() query pattern

Both address the previous review feedback.


555-576: LGTM!

The ssrf_violations table correctly implements:

  • Port range constraint (1-65535) matching the model's Field constraint
  • Status/resolution consistency CHECK constraint (lines 569-575) that mirrors the SsrfViolation._validate_resolution validator

This addresses the previous review feedback about enforcing lifecycle invariants at the DB layer.


578-582: LGTM!

The indexes are well-designed:

  • idx_sv_status_timestamp supports the common list_violations(status=...) query pattern
  • idx_sv_hostname as a composite (hostname, port) index is a good choice for allowlist lookups that typically check hostname:port pairs

This addresses the previous review suggestion about adding a hostname index.

tests/unit/security/test_ssrf_violation.py (6)

15-21: LGTM!

Clean enum value assertions covering all three status values.


27-70: LGTM!

The happy-path tests comprehensively cover all three states (PENDING, ALLOWED, DENIED) with complete assertions including resolved_at (lines 55, 70). The vendor-agnostic test-provider naming is correctly used.


72-81: LGTM!

Correctly verifies the model's immutability from frozen=True in the ConfigDict.


83-127: LGTM!

Complete validation coverage for the cross-field _validate_resolution constraints:

  • PENDING with resolved_by rejected (lines 83-93)
  • PENDING with resolved_at rejected (lines 95-105) — addresses previous review feedback
  • ALLOWED without resolved_by rejected (lines 107-116)
  • DENIED without resolved_by rejected (lines 118-127) — addresses previous review feedback

129-159: LGTM!

Good boundary testing for port constraints (0 rejected, 65535 accepted) and verification of optional field defaults.


161-174: LGTM!

Standard serialization roundtrip test ensuring model_dump()model_validate() preserves all data.

tests/unit/persistence/sqlite/test_migrations.py (2)

42-43: LGTM!

The expected tables set correctly includes the two new tables (risk_overrides, ssrf_violations) matching schema.sql.


110-114: LGTM!

The expected indexes set correctly includes all five new indexes for the risk_overrides and ssrf_violations tables, matching schema.sql.


Walkthrough

Updates CI DAST workflow to match ZAP's report filenames and make artifact upload warn when reports are missing. Adds SecOps runtime risk-tier override model and classifier, SSRF violation model and status enum, observability event constants, persistence protocols and SQLite and in-memory repository implementations, and corresponding schema migrations. Introduces sandbox security components: SandboxPolicy (4-domain), runtime resolver with gVisor defaults and merge helper, credential manager, auth-proxy stub, DockerSandbox runtime wiring and resolver attachment, and a Settings subscriber for discovery allowlist hot-reload. Numerous unit tests added/updated.

Suggested labels

autorelease: tagged

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed Title clearly summarizes the batch of security hardening changes including sandbox improvements, risk override, SSRF self-healing, and DAST fix across four linked issues.
Description check ✅ Passed Description is well-structured and comprehensive, detailing changes for all four linked issues, test results, and follow-up items. Directly related to the changeset.
Linked Issues check ✅ Passed PR addresses all four linked issues: #1098 (DAST filenames), #696 (sandbox security), #222 (risk tier override), and #671 (SSRF self-healing) with corresponding implementation across workflow, models, repositories, and event constants.
Out of Scope Changes check ✅ Passed All changes are scoped to the four linked issues. No unrelated modifications found beyond the security hardening objectives, documentation updates, and test coverage additions.
Docstring Coverage ✅ Passed Docstring coverage is 45.27% which is sufficient. The required threshold is 40.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/persistence/protocol.py`:
- Around line 245-253: Update the PersistenceBackend class docstring Attributes:
section to include the new protocol properties risk_overrides and
ssrf_violations by adding entries for risk_overrides: RiskOverrideRepository and
ssrf_violations: SsrfViolationRepository (matching the existing docstring
style/format used for other attributes), so the docstring reflects the actual
protocol surface defined by the risk_overrides and ssrf_violations `@property`
methods.

In `@src/synthorg/persistence/sqlite/schema.sql`:
- Around line 522-535: The ssrf_violations.port column currently allows any
integer; add a CHECK constraint to enforce valid TCP/UDP port bounds (0–65535)
so invalid ports cannot be persisted. Update the CREATE TABLE for
ssrf_violations (or ALTER TABLE if migrating) to add a CHECK on "port" (e.g.,
CHECK (port BETWEEN 0 AND 65535)) and ensure the constraint name/definition is
applied consistently during migrations; reference the ssrf_violations table and
port column when making the change.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py`:
- Around line 162-180: The list_violations method allows arbitrary integers for
the limit parameter which permits negative values (e.g., -1) to produce an
unbounded query; validate and sanitize limit at the start of list_violations:
reject or raise for non-positive values (<=0) and clamp any excessively large
positive values to a safe max (e.g., MAX_LIMIT constant). Update the signature
handling in list_violations to check the incoming limit, raise a ValueError or
return a clear error for invalid limits, and use the sanitized/clamped value in
the params tuple that is sent to SQLite.

In `@src/synthorg/security/rules/risk_override.py`:
- Around line 138-194: The classification loop currently logs
SECURITY_RISK_OVERRIDE_EXPIRED for any inactive override (override.is_active),
which wrongly classifies revoked overrides as expired; change the logic in
classify (where SECURITY_RISK_OVERRIDE_EXPIRED / SECURITY_RISK_OVERRIDE_APPLIED
are used) to detect revoked overrides (e.g., check override.revoked_at or
revoked_by on the RiskTierOverride) and emit SECURITY_RISK_OVERRIDE_REVOKED for
revoked entries while only emitting SECURITY_RISK_OVERRIDE_EXPIRED for
non-revoked expirations; additionally, add logger.info calls in add_override
(use SECURITY_RISK_OVERRIDE_CREATED with override.id, action_type/context,
original_tier, override_tier) and in revoke_override after creating revoked (use
SECURITY_RISK_OVERRIDE_REVOKED with override.id, revoked_by, revoked_at,
original_tier, override_tier) so all state transitions are logged at INFO.
- Around line 111-154: The constructor stores overrides in self._overrides
assuming oldest→newest but persisted tuples may be newest→oldest, causing
classify (which iterates reversed(self._overrides)) to pick the wrong
precedence; fix by normalizing ordering in __init__ (e.g., set self._overrides =
list(reversed(overrides)) or otherwise ensure the internal list is
oldest→newest) so classify(action_type) and its reversed iteration yield the
intended "last added wins" behavior; adjust only __init__ (constructor) so other
code using classify and _base.classify remains unchanged.

In `@src/synthorg/security/ssrf_violation.py`:
- Around line 44-52: The model currently allows any non-empty string for the url
field but must reject unredacted URLs before persistence; add a Pydantic
validator on the url field in the SsrfViolation model
(src/synthorg/security/ssrf_violation.py) that parses the URL
(urllib.parse.urlparse), and raises a ValidationError if the URL contains
userinfo (parsed.username or parsed.password) or if the query contains
suspicious parameter names like "password", "pass", "token", "secret",
"api_key", etc.; implement this as a `@validator`("url") (or `@field_validator` for
pydantic v2) so invalid/unredacted URLs are rejected at model construction
rather than being written by the repository (ssrf_violation_repo.py).

In `@src/synthorg/settings/subscribers/security_subscriber.py`:
- Around line 80-103: The two warning paths currently emit
SECURITY_ALLOWLIST_UPDATED which hides failures from event-name-based
dashboards; change the error event name to a distinct constant (e.g.,
SECURITY_ALLOWLIST_UPDATE_FAILED) and use that constant in the warning logger
calls (the two places that currently call logger.warning with
SECURITY_ALLOWLIST_UPDATED), leaving logger.info to continue using
SECURITY_ALLOWLIST_UPDATED on success, and ensure the new constant is defined
and any consumers/tests that expect the old name are updated accordingly; keep
the call to await self._on_changed(allowlist) and the existing error kwarg
values unchanged.

In `@src/synthorg/tools/sandbox/auth_proxy.py`:
- Around line 44-61: The start method SandboxAuthProxy.start currently raises
NotImplementedError without logging; modify it to log a warning or error with
context before raising (e.g., use the module logger via
logging.getLogger(__name__) or an existing module-level logger) so the message
includes that the auth proxy start is unimplemented and any relevant parameters
(like port), then raise the NotImplementedError as before.

In `@src/synthorg/tools/sandbox/credential_manager.py`:
- Around line 64-104: The container path bypasses credential stripping: update
the code that builds the container Env list to run env_overrides through the
credential manager (use sanitize_env or sanitize_env_with_report from
CredentialManager which calls _do_sanitize) before passing values into
_validate_env or constructing the Docker Env list; specifically, obtain a
CredentialManager instance, replace uses of the raw env_overrides (or direct
passing into _validate_env/Env construction) with the sanitized dict returned
from sanitize_env (or capture stripped keys with sanitize_env_with_report for
logging) so secrets are removed prior to creating the container environment.

In `@src/synthorg/tools/sandbox/runtime_resolver.py`:
- Around line 25-132: The container-creation path never uses
SandboxRuntimeResolver.resolve_runtime, so per-category overrides and gVisor
fallback are ignored; update the code that currently sets HostConfig["Runtime"]
from self._config.runtime to call the resolver and use its result. Specifically,
in the container-creation method that currently assigns HostConfig["Runtime"]
(replace the direct use of self._config.runtime), obtain the effective runtime
via SandboxRuntimeResolver.resolve_runtime(category) (or
self._runtime_resolver.resolve_runtime(category) if you add/hold an instance),
set HostConfig["Runtime"] only when the resolver returns a non-None string, and
otherwise omit the Runtime key so Docker’s default is used; ensure logging/error
handling remains consistent with the resolver behavior.
- Around line 107-117: The current branch in runtime_resolver where override =
self._config.runtime_overrides.get(category) returns None when a per-category
override exists but isn't in self._available, which prevents falling back to the
global self._config.runtime; instead, remove the immediate return and only emit
the logger.warning (using SANDBOX_GVISOR_FALLBACK with
category/requested/available) so the method can continue and evaluate
self._config.runtime and the documented resolver order; update the code path
around runtime_overrides.get(category), the override in self._available check,
and the logger.warning so missing per-category overrides don't short-circuit
resolution.

In `@tests/unit/api/fakes_backend.py`:
- Around line 177-183: The fake backend's properties risk_overrides and
ssrf_violations currently return object() and are typed Any; change them to
return proper fake repository instances that implement the
RiskOverrideRepository and SsrfViolationRepository interfaces (e.g., create
lightweight FakeRiskOverrideRepository and FakeSsrfViolationRepository or reuse
existing fake repo classes) and update the property return types to
RiskOverrideRepository and SsrfViolationRepository so the fake
PersistenceBackend satisfies the protocol and preserves type safety; ensure the
properties are named risk_overrides and ssrf_violations and return instances
with the minimal methods used by tests.

In `@tests/unit/tools/sandbox/test_runtime_resolver.py`:
- Around line 122-127: The test test_probe_returns_runc_fallback_on_failure is
nondeterministic because it relies on the CI environment; update the test to
mock the Docker probing used by SandboxRuntimeResolver.probe_available_runtimes
so you can force both outcomes: (1) simulate Docker probe failure and assert the
result equals or contains frozenset({"runc"}) (fallback path), and (2) simulate
Docker probe success returning a known set and assert those runtimes are present
(success path); locate and mock the internal function or method that performs
the Docker check/call inside SandboxRuntimeResolver (e.g., the Docker probe
helper used by probe_available_runtimes) and restore the original behavior after
each scenario to keep tests isolated.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2806967d-39e3-4c46-81ac-6829687b93d6

📥 Commits

Reviewing files that changed from the base of the PR and between c845d22 and 767d5ec.

📒 Files selected for processing (33)
  • .github/workflows/dast.yml
  • CLAUDE.md
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/observability/events/security.py
  • src/synthorg/persistence/protocol.py
  • src/synthorg/persistence/risk_override_repo.py
  • src/synthorg/persistence/sqlite/backend.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
  • src/synthorg/persistence/sqlite/schema.sql
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/ssrf_violation_repo.py
  • src/synthorg/security/rules/risk_override.py
  • src/synthorg/security/ssrf_violation.py
  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/tools/sandbox/credential_manager.py
  • src/synthorg/tools/sandbox/docker_config.py
  • src/synthorg/tools/sandbox/factory.py
  • src/synthorg/tools/sandbox/policy.py
  • src/synthorg/tools/sandbox/runtime_resolver.py
  • tests/unit/api/fakes_backend.py
  • tests/unit/persistence/sqlite/test_migrations.py
  • tests/unit/persistence/test_protocol.py
  • tests/unit/security/test_risk_override.py
  • tests/unit/security/test_ssrf_violation.py
  • tests/unit/settings/test_security_subscriber.py
  • tests/unit/tools/sandbox/test_auth_proxy.py
  • tests/unit/tools/sandbox/test_credential_manager.py
  • tests/unit/tools/sandbox/test_docker_config.py
  • tests/unit/tools/sandbox/test_policy.py
  • tests/unit/tools/sandbox/test_runtime_resolver.py
  • tests/unit/tools/sandbox/test_sandbox_factory.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations—Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: except A, B: (no parentheses)—ruff enforces this on Python 3.14.
Type hints: all public functions required, mypy strict mode enforced.
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules).
Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.
Use @computed_field for derived values instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens).
Use NotBlankStr (from core.types) for all identifier/name fields—including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants—instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (ruff enforced).
Functions: < 50 lines, files < 800 lines.
Handle errors explicitly, never silently swallow.
Validate at system boundaries (user input, external APIs, config files).
Variable name: always logger (not _logger, not log).
All error paths must log at WARNING or ERROR with context before raising.
All state transitions must log at INFO.
DEBUG for object creation, internal flow, entry/exit of key functions.
Pure data models, enums, and re-exports do NOT need logging.

Files:

  • tests/unit/persistence/sqlite/test_migrations.py
  • tests/unit/api/fakes_backend.py
  • tests/unit/persistence/test_protocol.py
  • src/synthorg/persistence/protocol.py
  • tests/unit/tools/sandbox/test_docker_config.py
  • tests/unit/tools/sandbox/test_auth_proxy.py
  • src/synthorg/persistence/sqlite/backend.py
  • tests/unit/tools/sandbox/test_runtime_resolver.py
  • tests/unit/settings/test_security_subscriber.py
  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/tools/sandbox/docker_config.py
  • tests/unit/tools/sandbox/test_policy.py
  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/tools/sandbox/factory.py
  • tests/unit/tools/sandbox/test_credential_manager.py
  • tests/unit/tools/sandbox/test_sandbox_factory.py
  • src/synthorg/persistence/risk_override_repo.py
  • tests/unit/security/test_ssrf_violation.py
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/tools/sandbox/credential_manager.py
  • tests/unit/security/test_risk_override.py
  • src/synthorg/observability/events/security.py
  • src/synthorg/persistence/ssrf_violation_repo.py
  • src/synthorg/security/ssrf_violation.py
  • src/synthorg/tools/sandbox/runtime_resolver.py
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
  • src/synthorg/tools/sandbox/policy.py
  • src/synthorg/security/rules/risk_override.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow.
Async: asyncio_mode = "auto"—no manual @pytest.mark.asyncio needed.
Timeout: 30 seconds per test (global in pyproject.toml—do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed).
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases.
Tests must use test-provider, test-small-001, etc. (not real vendor names).
Property-based testing: Python uses Hypothesis (@given + @settings).
Hypothesis profiles configured in tests/conftest.py: ci (deterministic, max_examples=10 + derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline). Controlled via HYPOTHESIS_PROFILE env var.
Never skip, dismiss, or ignore flaky tests—always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic. For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number).

Files:

  • tests/unit/persistence/sqlite/test_migrations.py
  • tests/unit/api/fakes_backend.py
  • tests/unit/persistence/test_protocol.py
  • tests/unit/tools/sandbox/test_docker_config.py
  • tests/unit/tools/sandbox/test_auth_proxy.py
  • tests/unit/tools/sandbox/test_runtime_resolver.py
  • tests/unit/settings/test_security_subscriber.py
  • tests/unit/tools/sandbox/test_policy.py
  • tests/unit/tools/sandbox/test_credential_manager.py
  • tests/unit/tools/sandbox/test_sandbox_factory.py
  • tests/unit/security/test_ssrf_violation.py
  • tests/unit/security/test_risk_override.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/persistence/sqlite/test_migrations.py
  • tests/unit/api/fakes_backend.py
  • tests/unit/persistence/test_protocol.py
  • tests/unit/tools/sandbox/test_docker_config.py
  • tests/unit/tools/sandbox/test_auth_proxy.py
  • tests/unit/tools/sandbox/test_runtime_resolver.py
  • tests/unit/settings/test_security_subscriber.py
  • tests/unit/tools/sandbox/test_policy.py
  • tests/unit/tools/sandbox/test_credential_manager.py
  • tests/unit/tools/sandbox/test_sandbox_factory.py
  • tests/unit/security/test_ssrf_violation.py
  • tests/unit/security/test_risk_override.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(name)
Never use import logging / logging.getLogger() / print() in application code. Exception: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, and observability/http_handler.py may use stdlib logging and print(..., file=sys.stderr).
Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api). Import directly: from synthorg.observability.events. import EVENT_CONSTANT
Structured kwargs: always logger.info(EVENT, key=value)—never logger.info("msg %s", val)
All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code.
RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig.
Retryable errors (is_retryable=True): RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError.
Non-retryable errors raise immediately without retry.
RetryExhaustedError signals that all retries failed—the engine layer catches this to trigger fallback chains.
Rate limiter respects RateLimitError.retry_after from providers—automatically pauses future requests.

Files:

  • src/synthorg/persistence/protocol.py
  • src/synthorg/persistence/sqlite/backend.py
  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/tools/sandbox/docker_config.py
  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/tools/sandbox/factory.py
  • src/synthorg/persistence/risk_override_repo.py
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/tools/sandbox/credential_manager.py
  • src/synthorg/observability/events/security.py
  • src/synthorg/persistence/ssrf_violation_repo.py
  • src/synthorg/security/ssrf_violation.py
  • src/synthorg/tools/sandbox/runtime_resolver.py
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
  • src/synthorg/tools/sandbox/policy.py
  • src/synthorg/security/rules/risk_override.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/persistence/protocol.py
  • src/synthorg/persistence/sqlite/backend.py
  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/tools/sandbox/docker_config.py
  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/tools/sandbox/factory.py
  • src/synthorg/persistence/risk_override_repo.py
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/tools/sandbox/credential_manager.py
  • src/synthorg/observability/events/security.py
  • src/synthorg/persistence/ssrf_violation_repo.py
  • src/synthorg/security/ssrf_violation.py
  • src/synthorg/tools/sandbox/runtime_resolver.py
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
  • src/synthorg/tools/sandbox/policy.py
  • src/synthorg/security/rules/risk_override.py
🧠 Learnings (60)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T21:32:02.880Z
Learning: Applies to .github/workflows/docker.yml : Docker workflow: builds backend + web + sandbox images, pushes to GHCR, signs with cosign. SLSA L3 provenance attestations via actions/attest-build-provenance. Scans: Trivy (CRITICAL = hard fail, HIGH = warn) + Grype (critical cutoff) + CIS Docker Benchmark v1.6.0 compliance (informational). CVE triage via .github/.trivyignore.yaml and .github/.grype.yaml. Images only pushed after scans pass. Triggers on push to main and version tags (v*).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to .github/zap-rules.tsv : ZAP DAST security scanning rules in `.github/zap-rules.tsv`
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Security scanning: gitleaks (push/PR + weekly), zizmor (workflow analysis), OSSF Scorecard (weekly), Socket.dev (PR supply chain), ZAP DAST (weekly + manual).
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to .github/zap-rules.tsv : ZAP DAST security scanning rules in `.github/zap-rules.tsv`

Applied to files:

  • .github/workflows/dast.yml
📚 Learning: 2026-03-15T18:17:43.675Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:17:43.675Z
Learning: Applies to .github/workflows/**/*.yml : Path filtering: dorny/paths-filter detects Python/dashboard/docker changes; jobs only run when their domain is affected. CLI has its own workflow (cli.yml).

Applied to files:

  • .github/workflows/dast.yml
📚 Learning: 2026-03-15T21:32:02.880Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T21:32:02.880Z
Learning: Applies to .github/workflows/*.yml : Dependabot: daily updates for uv + github-actions + npm + pre-commit + docker + gomod, grouped minor/patch, no auto-merge. Use `/review-dep-pr` to review Dependabot PRs before merging.

Applied to files:

  • .github/workflows/dast.yml
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/persistence/**/*.py : Persistence uses pluggable PersistenceBackend protocol. SQLite is the initial backend. Settings use SettingsRepository (namespaced settings CRUD).

Applied to files:

  • src/synthorg/persistence/protocol.py
  • src/synthorg/persistence/sqlite/backend.py
  • src/synthorg/persistence/risk_override_repo.py
  • src/synthorg/persistence/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Persistence backend: pluggable PersistenceBackend protocol in `src/synthorg/persistence/`, SQLite initial, SettingsRepository (namespaced settings CRUD).

Applied to files:

  • src/synthorg/persistence/protocol.py
  • src/synthorg/persistence/sqlite/backend.py
  • src/synthorg/persistence/risk_override_repo.py
  • src/synthorg/persistence/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...

Applied to files:

  • src/synthorg/persistence/protocol.py
  • src/synthorg/persistence/sqlite/backend.py
  • src/synthorg/persistence/risk_override_repo.py
  • src/synthorg/persistence/ssrf_violation_repo.py
  • CLAUDE.md
  • src/synthorg/tools/sandbox/policy.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to docker/Dockerfile.sandbox : Docker sandbox: `synthorg-sandbox` — Python 3.14 + Node.js + git, non-root (UID 10001), agent code execution sandbox

Applied to files:

  • tests/unit/tools/sandbox/test_docker_config.py
  • src/synthorg/tools/sandbox/docker_config.py
  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/tools/sandbox/factory.py
  • tests/unit/tools/sandbox/test_sandbox_factory.py
  • src/synthorg/tools/sandbox/credential_manager.py
  • src/synthorg/tools/sandbox/runtime_resolver.py
  • src/synthorg/tools/sandbox/policy.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
  • tests/unit/security/test_risk_override.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to tests/**/*.py : Never skip, dismiss, or ignore flaky tests; fix them fully and fundamentally by mocking `time.monotonic()` and `asyncio.sleep()` for determinism or using `asyncio.Event().wait()` for indefinite blocking

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-04-06T20:42:41.219Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Applies to tests/**/*.py : Never skip, dismiss, or ignore flaky tests—always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic. For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number).

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally; mock `time.monotonic()` and `asyncio.sleep()` for determinism; use `asyncio.Event().wait()` for indefinite blocking instead of `asyncio.sleep(large_number)`

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to tests/**/*.py : Flaky tests: NEVER skip, dismiss, or ignore flaky tests -- always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)

Applied to files:

  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/observability/events/security.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.

Applied to files:

  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/observability/events/security.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/observability/events/security.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/observability/events/security.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.

Applied to files:

  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/observability/events/security.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly from the domain module

Applied to files:

  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/observability/events/security.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/observability/events/persistence.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly rather than using string literals

Applied to files:

  • src/synthorg/observability/events/sandbox.py
📚 Learning: 2026-03-31T16:09:24.320Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T16:09:24.320Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from `synthorg.observability.events.<domain>` modules (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly and use in structured logging

Applied to files:

  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/observability/events/security.py
📚 Learning: 2026-04-06T20:42:41.219Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT

Applied to files:

  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/observability/events/persistence.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

  • src/synthorg/tools/sandbox/docker_config.py
  • src/synthorg/security/ssrf_violation.py
  • src/synthorg/tools/sandbox/policy.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import `from synthorg.observability import get_logger` and define `logger = get_logger(__name__)`

Applied to files:

  • src/synthorg/tools/sandbox/docker_config.py
📚 Learning: 2026-04-06T20:42:41.219Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__)

Applied to files:

  • src/synthorg/tools/sandbox/docker_config.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import logger via `from synthorg.observability import get_logger` and initialize with `logger = get_logger(__name__)`

Applied to files:

  • src/synthorg/tools/sandbox/docker_config.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` followed by `logger = get_logger(__name__)`.

Applied to files:

  • src/synthorg/tools/sandbox/docker_config.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Handle errors explicitly, never silently swallow. Validate at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/tools/sandbox/docker_config.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use import logging / logging.getLogger() / print() in application code.

Applied to files:

  • src/synthorg/tools/sandbox/docker_config.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

  • src/synthorg/tools/sandbox/docker_config.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.

Applied to files:

  • tests/unit/security/test_ssrf_violation.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from the domain-specific module under `synthorg.observability.events` in logging calls

Applied to files:

  • src/synthorg/observability/events/persistence.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/settings/**/*.py : Settings package (settings/): runtime-editable settings persistence (DB > env > YAML > code defaults), typed definitions (9 namespaces), Fernet encryption for sensitive values, config bridge (JSON serialization for Pydantic/collections), ConfigResolver (typed accessors), validation, registry, change notifications via message bus, SettingsSubscriber protocol, SettingsChangeDispatcher (polls `#settings` channel, routes to subscribers, restart_required filtering)

Applied to files:

  • src/synthorg/settings/subscribers/security_subscriber.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.

Applied to files:

  • src/synthorg/settings/subscribers/security_subscriber.py
  • tests/unit/security/test_risk_override.py
  • src/synthorg/observability/events/security.py
  • src/synthorg/security/ssrf_violation.py
  • CLAUDE.md
  • src/synthorg/tools/sandbox/policy.py
  • src/synthorg/security/rules/risk_override.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)

Applied to files:

  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/tools/sandbox/credential_manager.py
  • tests/unit/security/test_risk_override.py
  • src/synthorg/observability/events/security.py
  • src/synthorg/security/ssrf_violation.py
  • CLAUDE.md
  • src/synthorg/tools/sandbox/policy.py
  • src/synthorg/security/rules/risk_override.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Use event constants from `synthorg.observability.events.<domain>` (e.g., `API_REQUEST_STARTED` from `events.api`); import directly and log with structured kwargs: `logger.info(EVENT, key=value)`, never interpolated strings

Applied to files:

  • src/synthorg/observability/events/security.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves

Applied to files:

  • src/synthorg/security/ssrf_violation.py
  • src/synthorg/tools/sandbox/policy.py
  • src/synthorg/security/rules/risk_override.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state

Applied to files:

  • src/synthorg/security/ssrf_violation.py
  • src/synthorg/tools/sandbox/policy.py
  • src/synthorg/security/rules/risk_override.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.

Applied to files:

  • src/synthorg/security/ssrf_violation.py
📚 Learning: 2026-04-06T16:35:12.934Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-06T16:35:12.934Z
Learning: All project conventions, commands, and standards are defined in CLAUDE.md - refer to it for project structure, package layout, code conventions, quick commands, git workflow, testing standards, design specifications, logging, resilience, and security patterns

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-31T14:31:11.894Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:31:11.894Z
Learning: CLI (Go binary): see `cli/CLAUDE.md` for commands, flags, and reference; use `go -C cli` (never `cd cli`)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Documentation source in `docs/` (Markdown, built with Zensical). Design spec in `docs/design/` (7 pages: index, agents, organization, communication, engine, memory, operations). Architecture in `docs/architecture/` (overview, tech-stack, decision log). Roadmap in `docs/roadmap/`. Security in `docs/security.md`. Licensing in `docs/licensing.md`. Reference in `docs/reference/`. REST API reference in `docs/rest-api.md`. Library reference in `docs/api/` (auto-generated from docstrings). Custom templates in `docs/overrides/`. Config in `mkdocs.yml`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to docs/** : Docs source in docs/ (Markdown, built with Zensical); design spec in docs/design/ (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).

Applied to files:

  • CLAUDE.md
  • src/synthorg/security/rules/risk_override.py
📚 Learning: 2026-03-31T14:28:28.895Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:28:28.895Z
Learning: Web dashboard: see `web/CLAUDE.md` for commands, design system, and component inventory

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to docs/design/*.md : Design spec pages: 7 pages in `docs/design/` — index, agents, organization, communication, engine, memory, operations

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to **/*.py : Use `except A, B:` (no parentheses) per PEP 758 exception syntax on Python 3.14

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Use PEP 758 except syntax with `except A, B:` (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (without parentheses) per PEP 758 for exception handling in Python 3.14

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 exception syntax enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
📚 Learning: 2026-04-06T20:42:41.219Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses)—ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
📚 Learning: 2026-03-15T16:55:07.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Handle errors explicitly; never silently swallow exceptions

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Handle errors explicitly—never silently swallow exceptions.

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).

Applied to files:

  • src/synthorg/tools/sandbox/policy.py

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

♻️ Duplicate comments (3)
src/synthorg/tools/sandbox/docker_sandbox.py (1)

461-466: ⚠️ Potential issue | 🔴 Critical

category is never passed — runtime resolver remains unused.

_run_container calls _build_container_config without the category argument, so it defaults to "". In _resolve_runtime, the condition if self._runtime_resolver is not None and category: (line 312) is falsy when category="", causing the resolver to be bypassed and self._config.runtime to be returned instead.

The entire resolver integration is dead code until category flows through execute()_run_container()_build_container_config().

🐛 Proposed fix to propagate category through the call chain
     async def execute(
         self,
         *,
         command: str,
         args: tuple[str, ...],
         cwd: Path | None = None,
         env_overrides: Mapping[str, str] | None = None,
         timeout: float | None = None,
+        category: str = "",
     ) -> SandboxResult:
         ...
         return await self._run_container(
             docker=docker,
             command=command,
             args=args,
             container_cwd=container_cwd,
             env_overrides=env_overrides,
             timeout=effective_timeout,
+            category=category,
         )

     async def _run_container(
         self,
         *,
         docker: aiodocker.Docker,
         command: str,
         args: tuple[str, ...],
         container_cwd: str,
         env_overrides: Mapping[str, str] | None,
         timeout: float,
+        category: str = "",
     ) -> SandboxResult:
         ...
         config = self._build_container_config(
             command=command,
             args=args,
             container_cwd=container_cwd,
             env_overrides=env_overrides,
+            category=category,
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/tools/sandbox/docker_sandbox.py` around lines 461 - 466,
_run_container calls _build_container_config without forwarding the category,
causing _resolve_runtime's category check to fail and bypass the runtime
resolver; propagate the category from execute() through _run_container() into
_build_container_config() and then into _resolve_runtime() by adding a category
parameter to _run_container and ensuring _build_container_config accepts and
forwards category to whichever call invokes _resolve_runtime() (references:
execute, _run_container, _build_container_config, _resolve_runtime, category).
src/synthorg/persistence/sqlite/ssrf_violation_repo.py (1)

162-171: ⚠️ Potential issue | 🟠 Major

Bound limit on this public query path.

The new limit <= 0 guard fixes the negative-LIMIT bug, but any large positive value still becomes a potentially expensive full-history read. This method should also log before raising on invalid input.

📏 Proposed fix
 _COLS = (
     "id, timestamp, url, hostname, port, resolved_ip, "
     "blocked_range, provider_name, status, resolved_by, resolved_at"
 )
+_MAX_LIST_VIOLATIONS = 500
@@
         """List violations, optionally filtered by status."""
         if limit <= 0:
             msg = "limit must be positive"
+            logger.warning(
+                PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED,
+                error=msg,
+                limit=limit,
+            )
             raise ValueError(msg)
+        limit = min(limit, _MAX_LIST_VIOLATIONS)
 
         if status is not None:
As per coding guidelines: "Validate at system boundaries (user input, external APIs, config files)." Also, "All error paths must log at WARNING or ERROR with context before raising."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py` around lines 162 -
171, In list_violations, validate and cap the incoming limit to a safe maximum
and log invalid values before raising: update SsrfViolationRepo.list_violations
to check for limit <= 0 and for limit > MAX_LIMIT (define a constant MAX_LIMIT
or use an existing one), call the logger (at WARNING) with context including the
provided limit and the caller intent, then either raise ValueError for
non-positive limits or clamp/raise for too-large limits per policy; reference
the method name list_violations and use the repository's logger instance when
adding the log calls.
src/synthorg/security/ssrf_violation.py (1)

9-9: ⚠️ Potential issue | 🟠 Major

Reject secret-bearing query parameters before persistence.

The validator now catches user:pass@host, but it still accepts URLs like ?token=... or ?api_key=..., and those values will be stored verbatim. Extend the redaction check to inspect query keys too.

🔐 Proposed fix
-from urllib.parse import urlparse
+from urllib.parse import parse_qsl, urlparse
@@
     `@field_validator`("url")
     `@classmethod`
     def _reject_unredacted_url(cls, v: str) -> str:
-        """Reject URLs that contain credentials in userinfo."""
+        """Reject URLs that contain credentials or secret query params."""
         parsed = urlparse(v)
         if parsed.username or parsed.password:
             msg = "url must be redacted -- credentials detected in userinfo"
             raise ValueError(msg)
+        sensitive_keys = {
+            "access_token",
+            "api_key",
+            "apikey",
+            "auth",
+            "pass",
+            "password",
+            "secret",
+            "token",
+        }
+        query_keys = {
+            key.lower() for key, _ in parse_qsl(parsed.query, keep_blank_values=True)
+        }
+        if query_keys & sensitive_keys:
+            msg = "url must be redacted -- sensitive query parameters detected"
+            raise ValueError(msg)
         return v
As per coding guidelines: "Validate at system boundaries (user input, external APIs, config files)."

Also applies to: 54-62

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/security/ssrf_violation.py` at line 9, The validator currently
parses URLs with urlparse but only checks for userinfo in netloc; extend it to
inspect query parameters too by using urllib.parse.parse_qsl (or parse_qs) to
extract query keys and check them against the existing secret-key blacklist
(e.g., token, api_key, password, secret, key, auth, access_token, bearer); if
any blacklisted key is present, reject the URL or redact the value before
persistence the same way the code handles user:pass@host. Update the URL
validation function in src/synthorg/security/ssrf_violation.py (the same routine
that handles the netloc/userinfo check; also adjust the logic referenced around
the current check at lines ~54-62) to perform this query-key inspection and
return/raise the same failure path used for userinfo detections.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/persistence/sqlite/schema.sql`:
- Around line 502-513: The risk_overrides table allows revoked_at and revoked_by
to be set independently, but the RiskTierOverride model requires them paired;
add a DB-level constraint on the risk_overrides table to enforce they are either
both NULL or both NOT NULL (e.g. a CHECK constraint like (revoked_at IS NULL AND
revoked_by IS NULL) OR (revoked_at IS NOT NULL AND revoked_by IS NOT NULL)),
update the CREATE TABLE for risk_overrides to include that CHECK, and add a
migration to reconcile any existing rows that violate this rule before applying
the constraint.
- Around line 522-535: The ssrf_violations table allows inconsistent rows (e.g.
status='pending' with resolver fields set), causing
SQLiteSsrfViolationRepository.list_violations() to drop them during
deserialization; add a CHECK constraint to the ssrf_violations CREATE TABLE that
enforces: when status = 'pending' then resolved_by IS NULL AND resolved_at IS
NULL, and when status IN ('allowed','denied') then resolved_by IS NOT NULL AND
resolved_at IS NOT NULL, so only valid pending/allowed/denied states can be
stored.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py`:
- Around line 143-160: The get() method currently lets _row_to_violation(...)
raise ValueError/ValidationError out of the repository; catch these exceptions
around the call to _row_to_violation in get(), log a contextual warning or error
using logger with PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED (including the
violation_id and original exception message), and then raise a PersistenceError
wrapping the original exception so malformed rows follow the repository error
contract; ensure you catch both ValueError and ValidationError and preserve
exception chaining (raise PersistenceError(...) from exc).

In `@src/synthorg/security/rules/risk_override.py`:
- Line 29: The code instantiates NotBlankStr as if it were callable; change the
constant _DEFAULT_REVOKED_BY to a plain string instead of calling
NotBlankStr—i.e., replace _DEFAULT_REVOKED_BY = NotBlankStr("system") with
_DEFAULT_REVOKED_BY = "system" so that type validation still occurs when
assigning to NotBlankStr-annotated fields like revoked_by in the related
classes/functions.

In `@src/synthorg/settings/subscribers/security_subscriber.py`:
- Around line 100-106: If applying the new allowlist fails, catch exceptions
from the call to self._on_changed(allowlist) in apply/update handler and
emit/log a SECURITY_ALLOWLIST_UPDATE_FAILED event with namespace, key and error
context before re-raising; specifically wrap the call to self._on_changed in a
try/except, log logger.error or logger.warning with
SECURITY_ALLOWLIST_UPDATE_FAILED, namespace=namespace, key=key,
count=len(allowlist) and the exception details, then re-raise so the existing
success path that logs SECURITY_ALLOWLIST_UPDATED remains unchanged.
- Around line 81-90: The code currently coerces every item in entries to strings
(tuple(str(e) for e in entries)), which accepts numbers/null/objects as valid
allowlist entries; change the validation in the callback that processes entries
so it rejects non-string or blank items instead of coercing them: verify entries
is a list, ensure each item is an instance of str and non-empty and matches the
expected "host:port" shape (or at least contains a colon and non-empty host and
port substrings) before constructing allowlist, and if any item is invalid log
SECURITY_ALLOWLIST_UPDATE_FAILED with namespace and key (using logger) and
return (fail closed) rather than creating allowlist; update references to
entries, allowlist, SECURITY_ALLOWLIST_UPDATE_FAILED, namespace, key, and logger
in security_subscriber.py accordingly.

In `@src/synthorg/tools/sandbox/auth_proxy.py`:
- Around line 67-73: stop() currently clears _port and _url without logging;
update stop() to check if self._port != 0 and, when true, emit an INFO-level log
about the state transition (e.g., include the existing port and/or url) before
setting self._port = 0 and self._url = ""; only log when a running proxy was
actually stopped (do nothing/logging when _port is already 0). Use the class's
logger instance for the INFO message and keep behavior safe to call when not
started.
- Around line 61-64: Replace the hard-coded event string in the logger call with
a constant: define AUTH_PROXY_NOT_IMPLEMENTED =
"sandbox.auth_proxy.not_implemented" in synthorg.observability.events.sandbox
(or add it to an appropriate events package module), import that constant into
src/synthorg/tools/sandbox/auth_proxy.py, and pass AUTH_PROXY_NOT_IMPLEMENTED to
logger.warning instead of the string literal (the call to update is the
logger.warning(...) invocation currently using the string).

In `@tests/unit/api/fakes_backend.py`:
- Around line 108-126: The fake update_status currently allows any status
transition; change it to mirror production by (1) rejecting attempts to set
status to SsrfViolationStatus.PENDING (return False) and (2) only applying the
update when the stored violation's current v.status is
SsrfViolationStatus.PENDING (otherwise return False). Update the async
update_status function in tests/unit/api/fakes_backend.py to check v =
self._violations.get(violation_id), then if status ==
SsrfViolationStatus.PENDING return False, if v.status !=
SsrfViolationStatus.PENDING return False, and only then perform the model_copy
update of "status", "resolved_by", and "resolved_at".

In `@tests/unit/tools/sandbox/test_runtime_resolver.py`:
- Around line 130-162: Tests currently monkeypatch
SandboxRuntimeResolver.probe_available_runtimes itself, so they don't exercise
the real probe logic; change the tests to mock the underlying aiodocker.Docker
behavior instead and call SandboxRuntimeResolver.probe_available_runtimes to
verify real behavior. Specifically, replace the existing monkeypatch of
probe_available_runtimes with monkeypatch.setattr("aiodocker.Docker", <mock>)
where the mock implements system.info() returning {"Runtimes": {...}} and
close(), assert the returned frozenset matches discovered runtimes; also add
tests where system.info() lacks "Runtimes" to assert fallback
frozenset({"runc"}) and where constructing/calling aiodocker.Docker raises to
assert a warning is logged and the same fallback is returned, referencing
SandboxRuntimeResolver.probe_available_runtimes, aiodocker.Docker, system.info,
and logging behavior.

---

Duplicate comments:
In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py`:
- Around line 162-171: In list_violations, validate and cap the incoming limit
to a safe maximum and log invalid values before raising: update
SsrfViolationRepo.list_violations to check for limit <= 0 and for limit >
MAX_LIMIT (define a constant MAX_LIMIT or use an existing one), call the logger
(at WARNING) with context including the provided limit and the caller intent,
then either raise ValueError for non-positive limits or clamp/raise for
too-large limits per policy; reference the method name list_violations and use
the repository's logger instance when adding the log calls.

In `@src/synthorg/security/ssrf_violation.py`:
- Line 9: The validator currently parses URLs with urlparse but only checks for
userinfo in netloc; extend it to inspect query parameters too by using
urllib.parse.parse_qsl (or parse_qs) to extract query keys and check them
against the existing secret-key blacklist (e.g., token, api_key, password,
secret, key, auth, access_token, bearer); if any blacklisted key is present,
reject the URL or redact the value before persistence the same way the code
handles user:pass@host. Update the URL validation function in
src/synthorg/security/ssrf_violation.py (the same routine that handles the
netloc/userinfo check; also adjust the logic referenced around the current check
at lines ~54-62) to perform this query-key inspection and return/raise the same
failure path used for userinfo detections.

In `@src/synthorg/tools/sandbox/docker_sandbox.py`:
- Around line 461-466: _run_container calls _build_container_config without
forwarding the category, causing _resolve_runtime's category check to fail and
bypass the runtime resolver; propagate the category from execute() through
_run_container() into _build_container_config() and then into _resolve_runtime()
by adding a category parameter to _run_container and ensuring
_build_container_config accepts and forwards category to whichever call invokes
_resolve_runtime() (references: execute, _run_container,
_build_container_config, _resolve_runtime, category).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0d2d6fe6-b27f-43df-828f-9c7951ff4109

📥 Commits

Reviewing files that changed from the base of the PR and between 767d5ec and 567bb29.

📒 Files selected for processing (12)
  • src/synthorg/observability/events/security.py
  • src/synthorg/persistence/protocol.py
  • src/synthorg/persistence/sqlite/schema.sql
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/security/rules/risk_override.py
  • src/synthorg/security/ssrf_violation.py
  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/tools/sandbox/docker_sandbox.py
  • src/synthorg/tools/sandbox/runtime_resolver.py
  • tests/unit/api/fakes_backend.py
  • tests/unit/tools/sandbox/test_runtime_resolver.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Backend
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Web
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations—Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: except A, B: (no parentheses)—ruff enforces this on Python 3.14.
Type hints: all public functions required, mypy strict mode enforced.
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules).
Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.
Use @computed_field for derived values instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens).
Use NotBlankStr (from core.types) for all identifier/name fields—including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants—instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (ruff enforced).
Functions: < 50 lines, files < 800 lines.
Handle errors explicitly, never silently swallow.
Validate at system boundaries (user input, external APIs, config files).
Variable name: always logger (not _logger, not log).
All error paths must log at WARNING or ERROR with context before raising.
All state transitions must log at INFO.
DEBUG for object creation, internal flow, entry/exit of key functions.
Pure data models, enums, and re-exports do NOT need logging.

Files:

  • src/synthorg/persistence/protocol.py
  • src/synthorg/tools/sandbox/docker_sandbox.py
  • src/synthorg/tools/sandbox/auth_proxy.py
  • tests/unit/tools/sandbox/test_runtime_resolver.py
  • src/synthorg/settings/subscribers/security_subscriber.py
  • tests/unit/api/fakes_backend.py
  • src/synthorg/tools/sandbox/runtime_resolver.py
  • src/synthorg/security/rules/risk_override.py
  • src/synthorg/security/ssrf_violation.py
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/observability/events/security.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(name)
Never use import logging / logging.getLogger() / print() in application code. Exception: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, and observability/http_handler.py may use stdlib logging and print(..., file=sys.stderr).
Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api). Import directly: from synthorg.observability.events. import EVENT_CONSTANT
Structured kwargs: always logger.info(EVENT, key=value)—never logger.info("msg %s", val)
All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code.
RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig.
Retryable errors (is_retryable=True): RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError.
Non-retryable errors raise immediately without retry.
RetryExhaustedError signals that all retries failed—the engine layer catches this to trigger fallback chains.
Rate limiter respects RateLimitError.retry_after from providers—automatically pauses future requests.

Files:

  • src/synthorg/persistence/protocol.py
  • src/synthorg/tools/sandbox/docker_sandbox.py
  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/tools/sandbox/runtime_resolver.py
  • src/synthorg/security/rules/risk_override.py
  • src/synthorg/security/ssrf_violation.py
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/observability/events/security.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/persistence/protocol.py
  • src/synthorg/tools/sandbox/docker_sandbox.py
  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/tools/sandbox/runtime_resolver.py
  • src/synthorg/security/rules/risk_override.py
  • src/synthorg/security/ssrf_violation.py
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/observability/events/security.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow.
Async: asyncio_mode = "auto"—no manual @pytest.mark.asyncio needed.
Timeout: 30 seconds per test (global in pyproject.toml—do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed).
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases.
Tests must use test-provider, test-small-001, etc. (not real vendor names).
Property-based testing: Python uses Hypothesis (@given + @settings).
Hypothesis profiles configured in tests/conftest.py: ci (deterministic, max_examples=10 + derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline). Controlled via HYPOTHESIS_PROFILE env var.
Never skip, dismiss, or ignore flaky tests—always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic. For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number).

Files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
  • tests/unit/api/fakes_backend.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
  • tests/unit/api/fakes_backend.py
🧠 Learnings (47)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Security scanning: gitleaks (push/PR + weekly), zizmor (workflow analysis), OSSF Scorecard (weekly), Socket.dev (PR supply chain), ZAP DAST (weekly + manual).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-06T16:35:12.934Z
Learning: Agents in the Parallel tier (Haiku-class) include 14 reviewers: async-concurrency-reviewer, comment-analyzer, design-token-audit, docs-consistency, go-conventions-enforcer, go-security-reviewer, infra-reviewer, issue-resolution-verifier, persistence-reviewer, pr-test-analyzer, security-reviewer, silent-failure-hunter, test-quality-reviewer, and type-design-analyzer
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Dependabot: daily updates (uv, github-actions, npm, pre-commit, docker, gomod), all updates grouped into 1 PR per ecosystem, no auto-merge. Use /review-dep-pr before merging.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T21:49:53.264Z
Learning: Fix everything valid — never skip when review agents find valid issues (including pre-existing issues in surrounding code, suggestions, and findings adjacent to the PR's changes). No deferring, no 'out of scope' skipping.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: After the PR exists, use /aurelio-review-pr to handle external reviewer feedback.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-06T16:35:12.934Z
Learning: Agents in the Quality tier (Sonnet-class) include: code-reviewer, python-reviewer, frontend-reviewer, go-reviewer, conventions-enforcer, logging-audit, resilience-audit, and api-contract-drift
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T05:46:17.064Z
Learning: Fix all valid issues found by review agents (including pre-existing issues in surrounding code, suggestions, and adjacent findings) — never skip or defer
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T21:32:02.880Z
Learning: Applies to .github/workflows/docker.yml : Docker workflow: builds backend + web + sandbox images, pushes to GHCR, signs with cosign. SLSA L3 provenance attestations via actions/attest-build-provenance. Scans: Trivy (CRITICAL = hard fail, HIGH = warn) + Grype (critical cutoff) + CIS Docker Benchmark v1.6.0 compliance (informational). CVE triage via .github/.trivyignore.yaml and .github/.grype.yaml. Images only pushed after scans pass. Triggers on push to main and version tags (v*).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Docker: build + Trivy/Grype scan + push to GHCR + cosign sign + SLSA L3 provenance. CVE triage: .github/.trivyignore.yaml, .github/.grype.yaml.
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/persistence/**/*.py : Persistence uses pluggable PersistenceBackend protocol. SQLite is the initial backend. Settings use SettingsRepository (namespaced settings CRUD).

Applied to files:

  • src/synthorg/persistence/protocol.py
  • tests/unit/api/fakes_backend.py
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Persistence backend: pluggable PersistenceBackend protocol in `src/synthorg/persistence/`, SQLite initial, SettingsRepository (namespaced settings CRUD).

Applied to files:

  • src/synthorg/persistence/protocol.py
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...

Applied to files:

  • src/synthorg/persistence/protocol.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to docker/Dockerfile.sandbox : Docker sandbox: `synthorg-sandbox` — Python 3.14 + Node.js + git, non-root (UID 10001), agent code execution sandbox

Applied to files:

  • src/synthorg/tools/sandbox/docker_sandbox.py
  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/tools/sandbox/runtime_resolver.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/settings/subscribers/security_subscriber.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to tests/**/*.py : Never skip, dismiss, or ignore flaky tests; fix them fully and fundamentally by mocking `time.monotonic()` and `asyncio.sleep()` for determinism or using `asyncio.Event().wait()` for indefinite blocking

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-04-06T20:42:41.219Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Applies to tests/**/*.py : Never skip, dismiss, or ignore flaky tests—always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic. For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number).

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally; mock `time.monotonic()` and `asyncio.sleep()` for determinism; use `asyncio.Event().wait()` for indefinite blocking instead of `asyncio.sleep(large_number)`

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to tests/**/*.py : Flaky tests: NEVER skip, dismiss, or ignore flaky tests -- always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/settings/**/*.py : Settings package (settings/): runtime-editable settings persistence (DB > env > YAML > code defaults), typed definitions (9 namespaces), Fernet encryption for sensitive values, config bridge (JSON serialization for Pydantic/collections), ConfigResolver (typed accessors), validation, registry, change notifications via message bus, SettingsSubscriber protocol, SettingsChangeDispatcher (polls `#settings` channel, routes to subscribers, restart_required filtering)

Applied to files:

  • src/synthorg/settings/subscribers/security_subscriber.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)

Applied to files:

  • src/synthorg/settings/subscribers/security_subscriber.py
  • tests/unit/api/fakes_backend.py
  • src/synthorg/security/rules/risk_override.py
  • src/synthorg/security/ssrf_violation.py
  • src/synthorg/observability/events/security.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.

Applied to files:

  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/security/rules/risk_override.py
  • src/synthorg/observability/events/security.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions.

Applied to files:

  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/security/rules/risk_override.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, and key function entry/exit

Applied to files:

  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/security/rules/risk_override.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)

Applied to files:

  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/observability/events/security.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and wrap with `MappingProxyType` for read-only enforcement

Applied to files:

  • tests/unit/api/fakes_backend.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).

Applied to files:

  • src/synthorg/security/rules/risk_override.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves

Applied to files:

  • src/synthorg/security/rules/risk_override.py
  • src/synthorg/security/ssrf_violation.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.

Applied to files:

  • src/synthorg/security/rules/risk_override.py
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions. Pure data models, enums, and re-exports do NOT need logging.

Applied to files:

  • src/synthorg/security/rules/risk_override.py
📚 Learning: 2026-03-31T20:07:03.035Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T20:07:03.035Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising; all state transitions must log at INFO; DEBUG for object creation, internal flow, entry/exit of key functions

Applied to files:

  • src/synthorg/security/rules/risk_override.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO level

Applied to files:

  • src/synthorg/security/rules/risk_override.py
📚 Learning: 2026-04-01T14:22:06.315Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T14:22:06.315Z
Learning: Applies to {**/*.py,web/src/**/*.{ts,tsx}} : Validate at system boundaries (user input, external APIs, config files)

Applied to files:

  • src/synthorg/security/ssrf_violation.py
📚 Learning: 2026-03-16T20:14:00.937Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T20:14:00.937Z
Learning: Applies to **/*.py : Validate: at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/security/ssrf_violation.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to **/*.py : Validate at system boundaries (user input, external APIs, config files)

Applied to files:

  • src/synthorg/security/ssrf_violation.py
📚 Learning: 2026-04-06T20:42:41.219Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Applies to **/*.py : Validate at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/security/ssrf_violation.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Handle errors explicitly, never silently swallow. Validate at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/security/ssrf_violation.py
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to **/*.py : Use `except A, B:` (no parentheses) per PEP 758 exception syntax on Python 3.14

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Use PEP 758 except syntax with `except A, B:` (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (without parentheses) per PEP 758 for exception handling in Python 3.14

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 exception syntax enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-04-06T20:42:41.219Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses)—ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-15T16:55:07.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Handle errors explicitly; never silently swallow exceptions

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Handle errors explicitly—never silently swallow exceptions.

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/observability/events/security.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.

Applied to files:

  • src/synthorg/observability/events/security.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly from the domain module

Applied to files:

  • src/synthorg/observability/events/security.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/observability/events/security.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/observability/events/security.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.

Applied to files:

  • src/synthorg/observability/events/security.py
📚 Learning: 2026-03-31T16:09:24.320Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T16:09:24.320Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from `synthorg.observability.events.<domain>` modules (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly and use in structured logging

Applied to files:

  • src/synthorg/observability/events/security.py
🔇 Additional comments (10)
src/synthorg/tools/sandbox/auth_proxy.py (2)

1-16: LGTM!

Module docstring is clear and logger setup follows the required pattern using get_logger(__name__).


19-42: LGTM!

Class design is clean with proper encapsulation via properties, type hints, and Google-style docstrings.

src/synthorg/tools/sandbox/runtime_resolver.py (2)

106-132: LGTM — fallback logic correctly falls through to global runtime.

The resolution logic properly implements the documented 3-step fallback:

  1. Per-category override (if available on host)
  2. Global runtime (if available on host)
  3. None for Docker default

The fall-through at line 117 (instead of early return) correctly allows unavailable per-category overrides to try the global runtime before giving up.


48-85: LGTM — robust daemon probing with proper cleanup.

The probe correctly:

  • Uses try/finally to ensure client cleanup
  • Falls back to {"runc"} on any daemon communication failure
  • Logs availability status at appropriate levels (INFO for success/known-unavailable, WARNING for daemon-unreachable)
tests/unit/tools/sandbox/test_runtime_resolver.py (1)

11-91: LGTM — comprehensive coverage of resolve_runtime() logic.

The test cases thoroughly cover:

  • Override available → returns override
  • Override unavailable, no global → returns None
  • Override unavailable, global available → falls through to global (line 34-44 validates the fix)
  • Global available/unavailable scenarios
  • Precedence between override and global
  • Categories without overrides using global
src/synthorg/tools/sandbox/docker_sandbox.py (3)

232-238: LGTM — credential sanitization correctly integrated.

Sanitization is applied before env validation, and SandboxCredentialManager.sanitize_env handles logging of stripped keys internally (per credential_manager.py:64-83).


136-145: LGTM — resolver attachment API is clean.

The setter allows late-binding of the resolver after construction, which is appropriate since probing is async and cannot occur in __init__.


306-314: Logic is correct, contingent on category being passed.

The fallback to self._config.runtime when category is empty is reasonable for backward compatibility. Once the category propagation issue (lines 461-466) is fixed, this method will correctly delegate to the resolver.

src/synthorg/observability/events/security.py (1)

79-90: LGTM.

These new event names stay consistent with the existing security taxonomy and give the override/allowlist flows distinct, searchable lifecycle events.

src/synthorg/persistence/protocol.py (1)

34-39: LGTM.

The protocol surface, imports, and docstring now stay in sync for the two new repositories.

Also applies to: 85-86, 247-255

Comment on lines +502 to +513
CREATE TABLE IF NOT EXISTS risk_overrides (
id TEXT PRIMARY KEY,
action_type TEXT NOT NULL,
original_tier TEXT NOT NULL,
override_tier TEXT NOT NULL,
reason TEXT NOT NULL,
created_by TEXT NOT NULL,
created_at TEXT NOT NULL,
expires_at TEXT NOT NULL,
revoked_at TEXT,
revoked_by TEXT
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Keep risk_overrides revocation fields paired at the DB layer.

RiskTierOverride already requires revoked_at and revoked_by together, but this table accepts one without the other. That weakens the auditable-override guarantee for any raw SQL write or migration.

🧱 Proposed fix
 CREATE TABLE IF NOT EXISTS risk_overrides (
     id TEXT PRIMARY KEY,
     action_type TEXT NOT NULL,
     original_tier TEXT NOT NULL,
     override_tier TEXT NOT NULL,
     reason TEXT NOT NULL,
     created_by TEXT NOT NULL,
     created_at TEXT NOT NULL,
     expires_at TEXT NOT NULL,
     revoked_at TEXT,
-    revoked_by TEXT
+    revoked_by TEXT,
+    CHECK (
+        (revoked_at IS NULL AND revoked_by IS NULL)
+        OR
+        (revoked_at IS NOT NULL AND revoked_by IS NOT NULL)
+    )
 );
As per coding guidelines: "Validate at system boundaries (user input, external APIs, config files)."
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
CREATE TABLE IF NOT EXISTS risk_overrides (
id TEXT PRIMARY KEY,
action_type TEXT NOT NULL,
original_tier TEXT NOT NULL,
override_tier TEXT NOT NULL,
reason TEXT NOT NULL,
created_by TEXT NOT NULL,
created_at TEXT NOT NULL,
expires_at TEXT NOT NULL,
revoked_at TEXT,
revoked_by TEXT
);
CREATE TABLE IF NOT EXISTS risk_overrides (
id TEXT PRIMARY KEY,
action_type TEXT NOT NULL,
original_tier TEXT NOT NULL,
override_tier TEXT NOT NULL,
reason TEXT NOT NULL,
created_by TEXT NOT NULL,
created_at TEXT NOT NULL,
expires_at TEXT NOT NULL,
revoked_at TEXT,
revoked_by TEXT,
CHECK (
(revoked_at IS NULL AND revoked_by IS NULL)
OR
(revoked_at IS NOT NULL AND revoked_by IS NOT NULL)
)
);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/schema.sql` around lines 502 - 513, The
risk_overrides table allows revoked_at and revoked_by to be set independently,
but the RiskTierOverride model requires them paired; add a DB-level constraint
on the risk_overrides table to enforce they are either both NULL or both NOT
NULL (e.g. a CHECK constraint like (revoked_at IS NULL AND revoked_by IS NULL)
OR (revoked_at IS NOT NULL AND revoked_by IS NOT NULL)), update the CREATE TABLE
for risk_overrides to include that CHECK, and add a migration to reconcile any
existing rows that violate this rule before applying the constraint.


logger = get_logger(__name__)

_DEFAULT_REVOKED_BY = NotBlankStr("system")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n -C3 'NotBlankStr\s*=' src/synthorg/core
fd -a 'types.py' src/synthorg/core | xargs -r sed -n '1,220p'
rg -n -C2 '_DEFAULT_REVOKED_BY' src/synthorg/security/rules/risk_override.py

Repository: Aureliolo/synthorg

Length of output: 14151


Remove NotBlankStr("system") constructor call—NotBlankStr is an Annotated type alias, not a class.

At line 29, NotBlankStr("system") will fail at import time because Annotated is not callable. Replace with:

_DEFAULT_REVOKED_BY = "system"

Type validation still applies when _DEFAULT_REVOKED_BY is assigned to NotBlankStr-annotated fields (e.g., revoked_by: NotBlankStr = _DEFAULT_REVOKED_BY); Pydantic enforces the constraints at those points.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/security/rules/risk_override.py` at line 29, The code
instantiates NotBlankStr as if it were callable; change the constant
_DEFAULT_REVOKED_BY to a plain string instead of calling NotBlankStr—i.e.,
replace _DEFAULT_REVOKED_BY = NotBlankStr("system") with _DEFAULT_REVOKED_BY =
"system" so that type validation still occurs when assigning to
NotBlankStr-annotated fields like revoked_by in the related classes/functions.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py`:
- Around line 124-131: The except clause in the SSRF violation save routine uses
tuple-style exceptions; change the clause to PEP 758 form so it reads "except
sqlite3.Error, aiosqlite.Error as exc:" (removing the parentheses and comma
inside them) while keeping the body intact (call self._rollback_quietly(), build
msg, logger.exception with PERSISTENCE_SSRF_VIOLATION_SAVE_FAILED, and raise
PersistenceError from exc); update any other similar except clauses in the same
module (e.g., in methods of the SSRFViolationRepo class or nearby functions) to
the same PEP 758 syntax.
- Around line 252-259: Replace the tuple-style except clause with PEP 758
comma-separated syntax: change the line "except (sqlite3.Error, aiosqlite.Error)
as exc:" to "except sqlite3.Error, aiosqlite.Error as exc:" in the SSRF
violation update block so the rollback call (self._rollback_quietly()),
logger.exception(PERSISTENCE_SSRF_VIOLATION_SAVE_FAILED, error=msg), and raise
PersistenceError(msg) from exc remain unchanged.
- Around line 198-204: Replace the parenthesized except tuple with PEP 758
syntax in the SSRF violation DB error handler: change the clause currently
written as "except (sqlite3.Error, aiosqlite.Error) as exc:" to the PEP 758 form
"except sqlite3.Error, aiosqlite.Error as exc:" in the block that logs
PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED and raises PersistenceError so the
exception is captured in exc without parentheses.
- Around line 150-156: The except clause in ssrf_violation_repo.py currently
uses parentheses for multiple exceptions; change the clause that now says
"except (sqlite3.Error, aiosqlite.Error) as exc" to the PEP 758 form without
parentheses by listing sqlite3.Error and aiosqlite.Error without surrounding
parentheses while keeping the exception binding as exc so the existing
logger.exception(PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED, error=msg) and raise
PersistenceError(msg) from exc behavior remains unchanged.
- Around line 162-169: Replace the current tuple-style except clause "except
(ValueError, ValidationError) as exc:" with the PEP 758 style "except
ValueError, ValidationError as exc:" in the SSRF deserialization error handling
block (the block that logs PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED and raises
PersistenceError using violation_id and msg); keep the logger.exception call and
the subsequent "raise PersistenceError(msg) from exc" unchanged.

In `@src/synthorg/settings/subscribers/security_subscriber.py`:
- Around line 101-108: Replace the parenthesized exception clause in the
try/except handling for parsing the allowlist: change "except
(json.JSONDecodeError, TypeError) as exc:" to the PEP 758 form "except
json.JSONDecodeError, TypeError as exc:" in the block that logs via
logger.warning with SECURITY_ALLOWLIST_UPDATE_FAILED (keeping namespace, key and
error=f"failed to parse allowlist: {exc}" unchanged).
- Around line 78-81: The try block assumes SettingsService.get() may return None
but it actually raises SettingNotFoundError on missing keys; remove the dead
fallback or handle the exception explicitly: either simplify by assigning raw =
setting.value after await self._settings_service.get(namespace, key) if the
setting is guaranteed to exist, or catch SettingNotFoundError around the await
(referencing SettingsService.get and SettingNotFoundError) and log/handle the
error and set raw = "[]" before calling json.loads(raw) so entries is computed
safely; update the block around _settings_service.get, raw, and entries
accordingly.

In `@tests/unit/api/fakes_backend.py`:
- Around line 54-55: The FakeRiskOverrideRepository.save implementation
currently overwrites existing entries; update async def save(self, override) in
tests/unit/api/fakes_backend.py to first check if override.id already exists in
self._overrides and, if so, raise DuplicateRecordError (the same exception type
declared by the RiskOverrideRepository protocol), otherwise store the override
in self._overrides as before; reference DuplicateRecordError and the
RiskOverrideRepository contract to ensure behavior matches the protocol.
- Around line 88-89: The fake backend's async save(self, violation:
SsrfViolation) currently overwrites entries in self._violations; change it to
check for an existing key and raise DuplicateRecordError instead of overwriting
when violation.id already exists. Locate the save method and use the same
DuplicateRecordError type used by the repo contract, referencing
self._violations and violation.id, and ensure the method raises that error for
duplicates and only inserts when the id is absent.

In `@tests/unit/tools/sandbox/test_runtime_resolver.py`:
- Around line 24-32: Rename or update the test so its name matches the asserted
behavior: either rename test_falls_back_to_runc_when_override_unavailable to
something like test_returns_none_when_override_unavailable (since the assertion
expects None), or modify the test to set a global runtime on DockerSandboxConfig
and then assert resolver.resolve_runtime("code_execution") == "runc"; locate the
test method, the DockerSandboxConfig instantiation, and the
SandboxRuntimeResolver.resolve_runtime call to apply the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d55c8997-24f2-4e49-bcfb-83eb041d021e

📥 Commits

Reviewing files that changed from the base of the PR and between 567bb29 and 2b70cb4.

📒 Files selected for processing (7)
  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/persistence/sqlite/schema.sql
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/tools/sandbox/auth_proxy.py
  • tests/unit/api/fakes_backend.py
  • tests/unit/tools/sandbox/test_runtime_resolver.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Web
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Backend
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations—Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: except A, B: (no parentheses)—ruff enforces this on Python 3.14.
Type hints: all public functions required, mypy strict mode enforced.
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules).
Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.
Use @computed_field for derived values instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens).
Use NotBlankStr (from core.types) for all identifier/name fields—including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants—instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (ruff enforced).
Functions: < 50 lines, files < 800 lines.
Handle errors explicitly, never silently swallow.
Validate at system boundaries (user input, external APIs, config files).
Variable name: always logger (not _logger, not log).
All error paths must log at WARNING or ERROR with context before raising.
All state transitions must log at INFO.
DEBUG for object creation, internal flow, entry/exit of key functions.
Pure data models, enums, and re-exports do NOT need logging.

Files:

  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/settings/subscribers/security_subscriber.py
  • tests/unit/api/fakes_backend.py
  • src/synthorg/observability/events/sandbox.py
  • tests/unit/tools/sandbox/test_runtime_resolver.py
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(name)
Never use import logging / logging.getLogger() / print() in application code. Exception: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, and observability/http_handler.py may use stdlib logging and print(..., file=sys.stderr).
Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api). Import directly: from synthorg.observability.events. import EVENT_CONSTANT
Structured kwargs: always logger.info(EVENT, key=value)—never logger.info("msg %s", val)
All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code.
RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig.
Retryable errors (is_retryable=True): RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError.
Non-retryable errors raise immediately without retry.
RetryExhaustedError signals that all retries failed—the engine layer catches this to trigger fallback chains.
Rate limiter respects RateLimitError.retry_after from providers—automatically pauses future requests.

Files:

  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow.
Async: asyncio_mode = "auto"—no manual @pytest.mark.asyncio needed.
Timeout: 30 seconds per test (global in pyproject.toml—do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed).
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases.
Tests must use test-provider, test-small-001, etc. (not real vendor names).
Property-based testing: Python uses Hypothesis (@given + @settings).
Hypothesis profiles configured in tests/conftest.py: ci (deterministic, max_examples=10 + derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline). Controlled via HYPOTHESIS_PROFILE env var.
Never skip, dismiss, or ignore flaky tests—always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic. For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number).

Files:

  • tests/unit/api/fakes_backend.py
  • tests/unit/tools/sandbox/test_runtime_resolver.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/api/fakes_backend.py
  • tests/unit/tools/sandbox/test_runtime_resolver.py
🧠 Learnings (42)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Security scanning: gitleaks (push/PR + weekly), zizmor (workflow analysis), OSSF Scorecard (weekly), Socket.dev (PR supply chain), ZAP DAST (weekly + manual).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-06T16:35:12.934Z
Learning: Agents in the Parallel tier (Haiku-class) include 14 reviewers: async-concurrency-reviewer, comment-analyzer, design-token-audit, docs-consistency, go-conventions-enforcer, go-security-reviewer, infra-reviewer, issue-resolution-verifier, persistence-reviewer, pr-test-analyzer, security-reviewer, silent-failure-hunter, test-quality-reviewer, and type-design-analyzer
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Dependabot: daily updates (uv, github-actions, npm, pre-commit, docker, gomod), all updates grouped into 1 PR per ecosystem, no auto-merge. Use /review-dep-pr before merging.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T21:32:02.880Z
Learning: Applies to .github/workflows/docker.yml : Docker workflow: builds backend + web + sandbox images, pushes to GHCR, signs with cosign. SLSA L3 provenance attestations via actions/attest-build-provenance. Scans: Trivy (CRITICAL = hard fail, HIGH = warn) + Grype (critical cutoff) + CIS Docker Benchmark v1.6.0 compliance (informational). CVE triage via .github/.trivyignore.yaml and .github/.grype.yaml. Images only pushed after scans pass. Triggers on push to main and version tags (v*).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T21:49:53.264Z
Learning: Fix everything valid — never skip when review agents find valid issues (including pre-existing issues in surrounding code, suggestions, and findings adjacent to the PR's changes). No deferring, no 'out of scope' skipping.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T05:46:17.064Z
Learning: Fix all valid issues found by review agents (including pre-existing issues in surrounding code, suggestions, and adjacent findings) — never skip or defer
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-06T16:35:12.934Z
Learning: Agents in the Quality tier (Sonnet-class) include: code-reviewer, python-reviewer, frontend-reviewer, go-reviewer, conventions-enforcer, logging-audit, resilience-audit, and api-contract-drift
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to docker/Dockerfile.sandbox : Docker sandbox: `synthorg-sandbox` — Python 3.14 + Node.js + git, non-root (UID 10001), agent code execution sandbox

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/settings/subscribers/security_subscriber.py
📚 Learning: 2026-03-31T16:09:24.320Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T16:09:24.320Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from `synthorg.observability.events.<domain>` modules (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly and use in structured logging

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly rather than using string literals

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/observability/events/sandbox.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Use event constants from `synthorg.observability.events.<domain>` (e.g., `API_REQUEST_STARTED` from `events.api`); import directly and log with structured kwargs: `logger.info(EVENT, key=value)`, never interpolated strings

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
📚 Learning: 2026-04-06T20:42:41.219Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly from the domain module

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/observability/events/sandbox.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from the domain-specific module under `synthorg.observability.events` in logging calls

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/observability/events/sandbox.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to src/**/*.py : All state transitions must log at INFO level

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
📚 Learning: 2026-04-06T20:42:41.219Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Applies to **/*.py : All state transitions must log at INFO.

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO level

Applied to files:

  • src/synthorg/tools/sandbox/auth_proxy.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/settings/**/*.py : Settings package (settings/): runtime-editable settings persistence (DB > env > YAML > code defaults), typed definitions (9 namespaces), Fernet encryption for sensitive values, config bridge (JSON serialization for Pydantic/collections), ConfigResolver (typed accessors), validation, registry, change notifications via message bus, SettingsSubscriber protocol, SettingsChangeDispatcher (polls `#settings` channel, routes to subscribers, restart_required filtering)

Applied to files:

  • src/synthorg/settings/subscribers/security_subscriber.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)

Applied to files:

  • src/synthorg/settings/subscribers/security_subscriber.py
  • tests/unit/api/fakes_backend.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.

Applied to files:

  • src/synthorg/settings/subscribers/security_subscriber.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and wrap with `MappingProxyType` for read-only enforcement

Applied to files:

  • tests/unit/api/fakes_backend.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/persistence/**/*.py : Persistence uses pluggable PersistenceBackend protocol. SQLite is the initial backend. Settings use SettingsRepository (namespaced settings CRUD).

Applied to files:

  • tests/unit/api/fakes_backend.py
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization) for `dict`/`list` fields

Applied to files:

  • tests/unit/api/fakes_backend.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)

Applied to files:

  • src/synthorg/observability/events/sandbox.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to tests/**/*.py : Never skip, dismiss, or ignore flaky tests; fix them fully and fundamentally by mocking `time.monotonic()` and `asyncio.sleep()` for determinism or using `asyncio.Event().wait()` for indefinite blocking

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-04-06T20:42:41.219Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Applies to tests/**/*.py : Never skip, dismiss, or ignore flaky tests—always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic. For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number).

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally; mock `time.monotonic()` and `asyncio.sleep()` for determinism; use `asyncio.Event().wait()` for indefinite blocking instead of `asyncio.sleep(large_number)`

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to tests/**/*.py : Flaky tests: NEVER skip, dismiss, or ignore flaky tests -- always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Persistence backend: pluggable PersistenceBackend protocol in `src/synthorg/persistence/`, SQLite initial, SettingsRepository (namespaced settings CRUD).

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to **/*.py : Use `except A, B:` (no parentheses) per PEP 758 exception syntax on Python 3.14

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Use PEP 758 except syntax with `except A, B:` (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (without parentheses) per PEP 758 for exception handling in Python 3.14

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 exception syntax enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-04-06T20:42:41.219Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses)—ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-15T16:55:07.730Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T16:55:07.730Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Handle errors explicitly; never silently swallow exceptions

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Handle errors explicitly—never silently swallow exceptions.

Applied to files:

  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
🔇 Additional comments (20)
src/synthorg/observability/events/sandbox.py (1)

24-30: LGTM! New event constants follow established patterns.

The new constants are:

  • Correctly typed as Final[str]
  • Follow the existing sandbox.<domain>.<action> naming convention
  • Already consumed correctly by runtime_resolver.py, credential_manager.py, and auth_proxy.py per the provided context snippets
src/synthorg/tools/sandbox/auth_proxy.py (4)

1-20: LGTM! Module setup follows project conventions.

  • Module docstring clearly explains the proxy's purpose and future implementation plans
  • Logger correctly initialized via get_logger(__name__)
  • Event constants properly imported from the domain-specific module

23-46: LGTM! Clean class design with proper encapsulation.

  • Private state (_port, _url) with read-only property accessors
  • Clear sentinel values documented (0 for port, empty string for URL)
  • Google-style docstrings on class and properties

48-69: LGTM! Error path correctly logs before raising.

The start() method properly logs at WARNING level with the event constant SANDBOX_AUTH_PROXY_NOT_IMPLEMENTED and includes context (port=port) before raising NotImplementedError. This follows the coding guideline requiring error paths to log with context before raising.


71-82: LGTM! State transition correctly logged at INFO level.

The stop() method:

  • Only logs when an actual state transition occurs (_port != 0)
  • Uses the SANDBOX_AUTH_PROXY_STOPPED event constant
  • Includes context (port=self._port)
  • Is safe to call when not started (idempotent)

This follows the coding guideline requiring state transitions to log at INFO.

tests/unit/tools/sandbox/test_runtime_resolver.py (3)

130-184: LGTM! Probe tests now exercise actual aiodocker integration logic.

The tests correctly mock aiodocker.Docker at the aiodocker module level rather than patching probe_available_runtimes itself. This validates the real probe implementation including:

  • Parsing Runtimes from Docker daemon info
  • Fallback on missing Runtimes key
  • Fallback on connection errors

The mock structure with _MockSystem.info() returning runtime data and _MockDocker.close() as async method properly simulates the aiodocker client lifecycle.


11-91: LGTM! Comprehensive coverage of resolve_runtime scenarios.

The test class thoroughly covers the runtime resolution logic:

  • Override available/unavailable with global fallback
  • Global runtime available/unavailable
  • Override precedence over global
  • Category-specific behavior

Test cases align well with the production code's _resolve_runtime delegation pattern shown in the context snippet.


93-124: LGTM! Factory default tests validate gVisor override behavior.

Tests confirm that high-risk categories (code_execution, terminal) respect both factory gVisor defaults and user-provided overrides, correctly verifying the precedence rules.

src/synthorg/settings/subscribers/security_subscriber.py (3)

1-25: LGTM!

Module setup follows all guidelines: proper logger initialization via get_logger(__name__), event constants imported from domain-specific module, and TYPE_CHECKING used correctly for type-only imports.


28-59: LGTM!

The class correctly implements the SettingsSubscriber protocol with proper type hints and docstrings. The keyword-only constructor pattern prevents positional argument errors.


110-125: LGTM!

Callback error handling correctly logs the exception with full context before re-raising, and the success path logs the state transition at INFO level with the allowlist count. This properly addresses the observability requirements.

tests/unit/api/fakes_backend.py (2)

116-118: LGTM — ValueError behavior matches SQLite implementation.

The ValueError for PENDING transitions matches the production SQLiteSsrfViolationRepository.update_status() implementation (lines 233-235 in src/synthorg/persistence/sqlite/ssrf_violation_repo.py). The protocol docstring should be updated to document this behavior, but that's outside this file's scope.


142-143: LGTM!

The new repository instances are properly instantiated and exposed through typed properties, addressing the previous review comment about returning proper repository types.

Also applies to: 272-278

src/synthorg/persistence/sqlite/schema.sql (2)

501-524: LGTM!

The risk_overrides table correctly implements the CHECK constraint ensuring revoked_at and revoked_by are either both NULL or both NOT NULL, as suggested in the previous review. The partial index idx_ro_active on (expires_at, created_at DESC) WHERE revoked_at IS NULL efficiently supports list_active() queries.


526-552: LGTM!

The ssrf_violations table correctly implements the status/resolution consistency CHECK constraint ensuring that pending status has NULL resolution fields and allowed/denied status has both resolution fields populated. The composite index idx_sv_status_timestamp efficiently supports the list_violations() query pattern with status filtering and timestamp ordering.

src/synthorg/persistence/sqlite/ssrf_violation_repo.py (5)

31-43: LGTM!

The helper functions are well-focused: _is_unique_constraint_error() correctly checks sqlite_errorname for constraint violations, and _ensure_utc() safely handles naive datetimes from SQLite.


46-63: LGTM!

The class correctly accepts an optional shared write lock with a sensible per-instance default for test ergonomics. The docstring properly documents the parameters.


65-74: LGTM!

The rollback helper correctly logs failures with exc_info=True before suppressing, which is appropriate for cleanup operations where the primary error is already being handled.


206-216: LGTM!

The deserialization loop correctly uses PEP 758 syntax (except ValueError, ValidationError:) and gracefully skips malformed rows with a warning, which is appropriate for list operations where partial results are acceptable.


264-294: LGTM!

The _row_to_violation() helper correctly destructures the SQLite row, handles nullable fields with fallback to None, and uses _ensure_utc() for timezone-aware parsing.

Comment on lines +124 to +131
except (sqlite3.Error, aiosqlite.Error) as exc:
await self._rollback_quietly()
msg = f"Failed to save SSRF violation: {exc}"
logger.exception(
PERSISTENCE_SSRF_VIOLATION_SAVE_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

The coding guidelines require except A, B: syntax (no parentheses) for Python 3.14 with PEP 758.

🔧 Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use PEP 758 except syntax: except A, B: (no parentheses)—ruff enforces this on Python 3.14."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py` around lines 124 -
131, The except clause in the SSRF violation save routine uses tuple-style
exceptions; change the clause to PEP 758 form so it reads "except sqlite3.Error,
aiosqlite.Error as exc:" (removing the parentheses and comma inside them) while
keeping the body intact (call self._rollback_quietly(), build msg,
logger.exception with PERSISTENCE_SSRF_VIOLATION_SAVE_FAILED, and raise
PersistenceError from exc); update any other similar except clauses in the same
module (e.g., in methods of the SSRFViolationRepo class or nearby functions) to
the same PEP 758 syntax.

Comment on lines +150 to +156
except (sqlite3.Error, aiosqlite.Error) as exc:
msg = f"Failed to get SSRF violation: {exc}"
logger.exception(
PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

🔧 Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use PEP 758 except syntax: except A, B: (no parentheses)—ruff enforces this on Python 3.14."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py` around lines 150 -
156, The except clause in ssrf_violation_repo.py currently uses parentheses for
multiple exceptions; change the clause that now says "except (sqlite3.Error,
aiosqlite.Error) as exc" to the PEP 758 form without parentheses by listing
sqlite3.Error and aiosqlite.Error without surrounding parentheses while keeping
the exception binding as exc so the existing
logger.exception(PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED, error=msg) and raise
PersistenceError(msg) from exc behavior remains unchanged.

Comment on lines +162 to +169
except (ValueError, ValidationError) as exc:
msg = f"Failed to deserialize SSRF violation {violation_id!r}: {exc}"
logger.exception(
PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED,
error=msg,
violation_id=violation_id,
)
raise PersistenceError(msg) from exc
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

🔧 Proposed fix
-        except (ValueError, ValidationError) as exc:
+        except ValueError, ValidationError as exc:

As per coding guidelines: "Use PEP 758 except syntax: except A, B: (no parentheses)—ruff enforces this on Python 3.14."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py` around lines 162 -
169, Replace the current tuple-style except clause "except (ValueError,
ValidationError) as exc:" with the PEP 758 style "except ValueError,
ValidationError as exc:" in the SSRF deserialization error handling block (the
block that logs PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED and raises
PersistenceError using violation_id and msg); keep the logger.exception call and
the subsequent "raise PersistenceError(msg) from exc" unchanged.

Comment on lines +198 to +204
except (sqlite3.Error, aiosqlite.Error) as exc:
msg = f"Failed to list SSRF violations: {exc}"
logger.exception(
PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

🔧 Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use PEP 758 except syntax: except A, B: (no parentheses)—ruff enforces this on Python 3.14."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
except (sqlite3.Error, aiosqlite.Error) as exc:
msg = f"Failed to list SSRF violations: {exc}"
logger.exception(
PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
except sqlite3.Error, aiosqlite.Error as exc:
msg = f"Failed to list SSRF violations: {exc}"
logger.exception(
PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py` around lines 198 -
204, Replace the parenthesized except tuple with PEP 758 syntax in the SSRF
violation DB error handler: change the clause currently written as "except
(sqlite3.Error, aiosqlite.Error) as exc:" to the PEP 758 form "except
sqlite3.Error, aiosqlite.Error as exc:" in the block that logs
PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED and raises PersistenceError so the
exception is captured in exc without parentheses.

Comment on lines +252 to +259
except (sqlite3.Error, aiosqlite.Error) as exc:
await self._rollback_quietly()
msg = f"Failed to update SSRF violation status: {exc}"
logger.exception(
PERSISTENCE_SSRF_VIOLATION_SAVE_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

🔧 Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use PEP 758 except syntax: except A, B: (no parentheses)—ruff enforces this on Python 3.14."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
except (sqlite3.Error, aiosqlite.Error) as exc:
await self._rollback_quietly()
msg = f"Failed to update SSRF violation status: {exc}"
logger.exception(
PERSISTENCE_SSRF_VIOLATION_SAVE_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
except sqlite3.Error, aiosqlite.Error as exc:
await self._rollback_quietly()
msg = f"Failed to update SSRF violation status: {exc}"
logger.exception(
PERSISTENCE_SSRF_VIOLATION_SAVE_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py` around lines 252 -
259, Replace the tuple-style except clause with PEP 758 comma-separated syntax:
change the line "except (sqlite3.Error, aiosqlite.Error) as exc:" to "except
sqlite3.Error, aiosqlite.Error as exc:" in the SSRF violation update block so
the rollback call (self._rollback_quietly()),
logger.exception(PERSISTENCE_SSRF_VIOLATION_SAVE_FAILED, error=msg), and raise
PersistenceError(msg) from exc remain unchanged.

Comment on lines +78 to +81
try:
setting = await self._settings_service.get(namespace, key)
raw = setting.value if setting is not None else "[]"
entries = json.loads(raw)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Unreachable fallback—get() raises on missing key, never returns None.

Per SettingsService.get() (context snippet 1), the method raises SettingNotFoundError when the key isn't registered—it never returns None. The else "[]" fallback on line 80 is dead code, and SettingNotFoundError will propagate unhandled if the setting isn't registered.

If fail-closed is intended, catch SettingNotFoundError explicitly and log it. Otherwise, simplify to raw = setting.value.

🛡️ Option A: Fail-closed with explicit handling
+        from synthorg.settings.errors import SettingNotFoundError
+
         try:
             setting = await self._settings_service.get(namespace, key)
-            raw = setting.value if setting is not None else "[]"
+            raw = setting.value
             entries = json.loads(raw)
+        except SettingNotFoundError:
+            logger.warning(
+                SECURITY_ALLOWLIST_UPDATE_FAILED,
+                namespace=namespace,
+                key=key,
+                error="setting not registered",
+            )
+            return
-        except (json.JSONDecodeError, TypeError) as exc:
+        except json.JSONDecodeError, TypeError as exc:
✏️ Option B: Simplify if setting is guaranteed registered
         try:
             setting = await self._settings_service.get(namespace, key)
-            raw = setting.value if setting is not None else "[]"
+            raw = setting.value
             entries = json.loads(raw)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/settings/subscribers/security_subscriber.py` around lines 78 -
81, The try block assumes SettingsService.get() may return None but it actually
raises SettingNotFoundError on missing keys; remove the dead fallback or handle
the exception explicitly: either simplify by assigning raw = setting.value after
await self._settings_service.get(namespace, key) if the setting is guaranteed to
exist, or catch SettingNotFoundError around the await (referencing
SettingsService.get and SettingNotFoundError) and log/handle the error and set
raw = "[]" before calling json.loads(raw) so entries is computed safely; update
the block around _settings_service.get, raw, and entries accordingly.

Comment on lines +101 to +108
except (json.JSONDecodeError, TypeError) as exc:
logger.warning(
SECURITY_ALLOWLIST_UPDATE_FAILED,
namespace=namespace,
key=key,
error=f"failed to parse allowlist: {exc}",
)
return
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

The parenthesized form except (json.JSONDecodeError, TypeError) violates the project's PEP 758 convention. Python 3.14 uses comma-separated exception types without parentheses.

✏️ Proposed fix
-        except (json.JSONDecodeError, TypeError) as exc:
+        except json.JSONDecodeError, TypeError as exc:

As per coding guidelines: "Use PEP 758 except syntax: except A, B: (no parentheses)—ruff enforces this on Python 3.14."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
except (json.JSONDecodeError, TypeError) as exc:
logger.warning(
SECURITY_ALLOWLIST_UPDATE_FAILED,
namespace=namespace,
key=key,
error=f"failed to parse allowlist: {exc}",
)
return
except json.JSONDecodeError, TypeError as exc:
logger.warning(
SECURITY_ALLOWLIST_UPDATE_FAILED,
namespace=namespace,
key=key,
error=f"failed to parse allowlist: {exc}",
)
return
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/settings/subscribers/security_subscriber.py` around lines 101 -
108, Replace the parenthesized exception clause in the try/except handling for
parsing the allowlist: change "except (json.JSONDecodeError, TypeError) as exc:"
to the PEP 758 form "except json.JSONDecodeError, TypeError as exc:" in the
block that logs via logger.warning with SECURITY_ALLOWLIST_UPDATE_FAILED
(keeping namespace, key and error=f"failed to parse allowlist: {exc}"
unchanged).

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unit/tools/sandbox/test_runtime_resolver.py`:
- Around line 93-125: The tests currently pass explicit runtime_overrides into
DockerSandboxConfig so they don't exercise merge_gvisor_defaults; update the
tests to construct the config via the factory/default-merging path or call
merge_gvisor_defaults() directly (on the factory or config) to produce the
DockerSandboxConfig used by SandboxRuntimeResolver, then assert
resolver.resolve_runtime("code_execution") and
resolver.resolve_runtime("terminal") reflect the merged defaults; reference
DockerSandboxConfig, merge_gvisor_defaults, SandboxRuntimeResolver, and
resolve_runtime when locating where to change the test setup.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e33ff773-2807-4fea-9868-10996a2e7255

📥 Commits

Reviewing files that changed from the base of the PR and between 2b70cb4 and 94bbb2f.

📒 Files selected for processing (2)
  • tests/unit/api/fakes_backend.py
  • tests/unit/tools/sandbox/test_runtime_resolver.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Build Web
  • GitHub Check: Build Backend
  • GitHub Check: Build Sandbox
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (go)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations—Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: except A, B: (no parentheses)—ruff enforces this on Python 3.14.
Type hints: all public functions required, mypy strict mode enforced.
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules).
Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.
Use @computed_field for derived values instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens).
Use NotBlankStr (from core.types) for all identifier/name fields—including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants—instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (ruff enforced).
Functions: < 50 lines, files < 800 lines.
Handle errors explicitly, never silently swallow.
Validate at system boundaries (user input, external APIs, config files).
Variable name: always logger (not _logger, not log).
All error paths must log at WARNING or ERROR with context before raising.
All state transitions must log at INFO.
DEBUG for object creation, internal flow, entry/exit of key functions.
Pure data models, enums, and re-exports do NOT need logging.

Files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
  • tests/unit/api/fakes_backend.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow.
Async: asyncio_mode = "auto"—no manual @pytest.mark.asyncio needed.
Timeout: 30 seconds per test (global in pyproject.toml—do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed).
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases.
Tests must use test-provider, test-small-001, etc. (not real vendor names).
Property-based testing: Python uses Hypothesis (@given + @settings).
Hypothesis profiles configured in tests/conftest.py: ci (deterministic, max_examples=10 + derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline). Controlled via HYPOTHESIS_PROFILE env var.
Never skip, dismiss, or ignore flaky tests—always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic. For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number).

Files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
  • tests/unit/api/fakes_backend.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
  • tests/unit/api/fakes_backend.py
🧠 Learnings (15)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Security scanning: gitleaks (push/PR + weekly), zizmor (workflow analysis), OSSF Scorecard (weekly), Socket.dev (PR supply chain), ZAP DAST (weekly + manual).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-06T16:35:12.934Z
Learning: Agents in the Parallel tier (Haiku-class) include 14 reviewers: async-concurrency-reviewer, comment-analyzer, design-token-audit, docs-consistency, go-conventions-enforcer, go-security-reviewer, infra-reviewer, issue-resolution-verifier, persistence-reviewer, pr-test-analyzer, security-reviewer, silent-failure-hunter, test-quality-reviewer, and type-design-analyzer
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T21:32:02.880Z
Learning: Applies to .github/workflows/docker.yml : Docker workflow: builds backend + web + sandbox images, pushes to GHCR, signs with cosign. SLSA L3 provenance attestations via actions/attest-build-provenance. Scans: Trivy (CRITICAL = hard fail, HIGH = warn) + Grype (critical cutoff) + CIS Docker Benchmark v1.6.0 compliance (informational). CVE triage via .github/.trivyignore.yaml and .github/.grype.yaml. Images only pushed after scans pass. Triggers on push to main and version tags (v*).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Dependabot: daily updates (uv, github-actions, npm, pre-commit, docker, gomod), all updates grouped into 1 PR per ecosystem, no auto-merge. Use /review-dep-pr before merging.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to tests/**/*.py : Never skip, dismiss, or ignore flaky tests; fix them fully and fundamentally by mocking `time.monotonic()` and `asyncio.sleep()` for determinism or using `asyncio.Event().wait()` for indefinite blocking

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-04-06T20:42:41.219Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Applies to tests/**/*.py : Never skip, dismiss, or ignore flaky tests—always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic. For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number).

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally; mock `time.monotonic()` and `asyncio.sleep()` for determinism; use `asyncio.Event().wait()` for indefinite blocking instead of `asyncio.sleep(large_number)`

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to tests/**/*.py : Flaky tests: NEVER skip, dismiss, or ignore flaky tests -- always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to docker/Dockerfile.sandbox : Docker sandbox: `synthorg-sandbox` — Python 3.14 + Node.js + git, non-root (UID 10001), agent code execution sandbox

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and wrap with `MappingProxyType` for read-only enforcement

Applied to files:

  • tests/unit/api/fakes_backend.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)

Applied to files:

  • tests/unit/api/fakes_backend.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/persistence/**/*.py : Persistence uses pluggable PersistenceBackend protocol. SQLite is the initial backend. Settings use SettingsRepository (namespaced settings CRUD).

Applied to files:

  • tests/unit/api/fakes_backend.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization) for `dict`/`list` fields

Applied to files:

  • tests/unit/api/fakes_backend.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/api/**/*.py : REST API: Litestar framework, controllers with guards, channels for WebSocket, JWT + API key + WS ticket auth, approval gate integration, coordination endpoint, collaboration endpoint, settings endpoint. RFC 9457 structured errors (ErrorCategory, ErrorCode, ErrorDetail, ProblemDetail, CATEGORY_TITLES, category_title, category_type_uri, content negotiation).

Applied to files:

  • tests/unit/api/fakes_backend.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 with adopted conventions: use computed_field for derived values instead of storing + validating redundant fields; use NotBlankStr from core.types for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators.

Applied to files:

  • tests/unit/api/fakes_backend.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.

Applied to files:

  • tests/unit/api/fakes_backend.py
🔇 Additional comments (4)
tests/unit/api/fakes_backend.py (3)

55-60: Good protocol-aligned duplicate protection in fake repositories.

Both save() implementations now reject duplicate IDs with DuplicateRecordError, which keeps test fakes behaviorally consistent with persistence contracts.

Also applies to: 92-97


115-136: SSRF transition guard is correctly enforced in the fake.

update_status() now blocks transitions back to PENDING and only resolves currently pending records, which is the right behavior for avoiding false-positive test passes.


149-150: Backend surface extension looks correct.

Wiring and exposing risk_overrides and ssrf_violations repositories on FakePersistenceBackend is clean and consistent with the new persistence domains.

Also applies to: 279-285

tests/unit/tools/sandbox/test_runtime_resolver.py (1)

127-184: Probe tests are deterministic and validate real fallback paths

Good coverage here: mocking aiodocker.Docker (not the resolver method itself) verifies success parsing, missing-key fallback, and exception fallback behavior in probe_available_runtimes().

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
tests/unit/tools/sandbox/test_runtime_resolver.py (1)

96-125: ⚠️ Potential issue | 🟡 Minor

Factory-default tests still bypass merge_gvisor_defaults() path

These tests are labeled around factory-default behavior, but both construct DockerSandboxConfig with explicit runtime_overrides, so regressions in merge_gvisor_defaults() won’t be caught. Build config through the merge path and then assert resolver outcomes.

♻️ Proposed test setup adjustment
+from synthorg.tools.sandbox.factory import merge_gvisor_defaults
...
-        config = DockerSandboxConfig(
-            runtime_overrides={
-                "code_execution": "runsc",
-                "terminal": "runsc",
-            },
-        )
+        config = merge_gvisor_defaults(DockerSandboxConfig(runtime_overrides={}))
...
-        config = DockerSandboxConfig(
-            runtime_overrides={
-                "code_execution": "runc",
-                "terminal": "runc",
-            },
-        )
+        base = merge_gvisor_defaults(DockerSandboxConfig(runtime_overrides={}))
+        config = DockerSandboxConfig(
+            runtime=base.runtime,
+            runtime_overrides={"code_execution": "runc", "terminal": "runc"},
+        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/tools/sandbox/test_runtime_resolver.py` around lines 96 - 125,
Tests labeled to exercise factory-default behavior currently pass explicit
runtime_overrides to DockerSandboxConfig so they bypass the
merge_gvisor_defaults() logic; change the test setup to construct the config via
the merge_gvisor_defaults(...) call (or call the helper that returns the merged
DockerSandboxConfig) and then pass that resulting config into
SandboxRuntimeResolver (used with available_runtimes and resolve_runtime) before
asserting the expected runtimes, ensuring the merge_gvisor_defaults path is
actually exercised for "code_execution" and "terminal".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@tests/unit/tools/sandbox/test_runtime_resolver.py`:
- Around line 96-125: Tests labeled to exercise factory-default behavior
currently pass explicit runtime_overrides to DockerSandboxConfig so they bypass
the merge_gvisor_defaults() logic; change the test setup to construct the config
via the merge_gvisor_defaults(...) call (or call the helper that returns the
merged DockerSandboxConfig) and then pass that resulting config into
SandboxRuntimeResolver (used with available_runtimes and resolve_runtime) before
asserting the expected runtimes, ensuring the merge_gvisor_defaults path is
actually exercised for "code_execution" and "terminal".

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a8a107dc-4ec7-449f-8f95-764e2076df3b

📥 Commits

Reviewing files that changed from the base of the PR and between 94bbb2f and bad9a43.

📒 Files selected for processing (1)
  • tests/unit/tools/sandbox/test_runtime_resolver.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Deploy Preview
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Web
  • GitHub Check: Build Backend
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations—Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: except A, B: (no parentheses)—ruff enforces this on Python 3.14.
Type hints: all public functions required, mypy strict mode enforced.
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules).
Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.
Use @computed_field for derived values instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens).
Use NotBlankStr (from core.types) for all identifier/name fields—including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants—instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (ruff enforced).
Functions: < 50 lines, files < 800 lines.
Handle errors explicitly, never silently swallow.
Validate at system boundaries (user input, external APIs, config files).
Variable name: always logger (not _logger, not log).
All error paths must log at WARNING or ERROR with context before raising.
All state transitions must log at INFO.
DEBUG for object creation, internal flow, entry/exit of key functions.
Pure data models, enums, and re-exports do NOT need logging.

Files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow.
Async: asyncio_mode = "auto"—no manual @pytest.mark.asyncio needed.
Timeout: 30 seconds per test (global in pyproject.toml—do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed).
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases.
Tests must use test-provider, test-small-001, etc. (not real vendor names).
Property-based testing: Python uses Hypothesis (@given + @settings).
Hypothesis profiles configured in tests/conftest.py: ci (deterministic, max_examples=10 + derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline). Controlled via HYPOTHESIS_PROFILE env var.
Never skip, dismiss, or ignore flaky tests—always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic. For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number).

Files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
🧠 Learnings (8)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Security scanning: gitleaks (push/PR + weekly), zizmor (workflow analysis), OSSF Scorecard (weekly), Socket.dev (PR supply chain), ZAP DAST (weekly + manual).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-06T16:35:12.934Z
Learning: Agents in the Parallel tier (Haiku-class) include 14 reviewers: async-concurrency-reviewer, comment-analyzer, design-token-audit, docs-consistency, go-conventions-enforcer, go-security-reviewer, infra-reviewer, issue-resolution-verifier, persistence-reviewer, pr-test-analyzer, security-reviewer, silent-failure-hunter, test-quality-reviewer, and type-design-analyzer
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T21:32:02.880Z
Learning: Applies to .github/workflows/docker.yml : Docker workflow: builds backend + web + sandbox images, pushes to GHCR, signs with cosign. SLSA L3 provenance attestations via actions/attest-build-provenance. Scans: Trivy (CRITICAL = hard fail, HIGH = warn) + Grype (critical cutoff) + CIS Docker Benchmark v1.6.0 compliance (informational). CVE triage via .github/.trivyignore.yaml and .github/.grype.yaml. Images only pushed after scans pass. Triggers on push to main and version tags (v*).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Dependabot: daily updates (uv, github-actions, npm, pre-commit, docker, gomod), all updates grouped into 1 PR per ecosystem, no auto-merge. Use /review-dep-pr before merging.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T05:46:17.064Z
Learning: Fix all valid issues found by review agents (including pre-existing issues in surrounding code, suggestions, and adjacent findings) — never skip or defer
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to tests/**/*.py : Never skip, dismiss, or ignore flaky tests; fix them fully and fundamentally by mocking `time.monotonic()` and `asyncio.sleep()` for determinism or using `asyncio.Event().wait()` for indefinite blocking

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-04-06T20:42:41.219Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T20:42:41.219Z
Learning: Applies to tests/**/*.py : Never skip, dismiss, or ignore flaky tests—always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic. For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number).

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally; mock `time.monotonic()` and `asyncio.sleep()` for determinism; use `asyncio.Event().wait()` for indefinite blocking instead of `asyncio.sleep(large_number)`

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to tests/**/*.py : Flaky tests: NEVER skip, dismiss, or ignore flaky tests -- always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to docker/Dockerfile.sandbox : Docker sandbox: `synthorg-sandbox` — Python 3.14 + Node.js + git, non-root (UID 10001), agent code execution sandbox

Applied to files:

  • tests/unit/tools/sandbox/test_runtime_resolver.py

@Aureliolo
Copy link
Copy Markdown
Owner Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 7, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/synthorg/tools/sandbox/docker_sandbox.py (1)

218-238: ⚠️ Potential issue | 🟠 Major

Per-category runtime overrides are currently inert in the execute path.

category is introduced on Line 218 and used by _resolve_runtime on Line 312, but the current execution flow never supplies it (the _run_container call to _build_container_config at Line 461 uses defaults). That leaves category="", so _resolve_runtime falls back to self._config.runtime (Line 314), and category-specific runtime_overrides (including injected gVisor defaults) are not applied.

🔧 Proposed fix (plumb category through execution flow)
 async def execute(
     self,
     *,
     command: str,
     args: tuple[str, ...],
+    category: str = "",
     cwd: Path | None = None,
     env_overrides: Mapping[str, str] | None = None,
     timeout: float | None = None,  # noqa: ASYNC109
 ) -> SandboxResult:
@@
     return await self._run_container(
         docker=docker,
         command=command,
         args=args,
+        category=category,
         container_cwd=container_cwd,
         env_overrides=env_overrides,
         timeout=effective_timeout,
     )

 async def _run_container(  # noqa: PLR0913
     self,
     *,
     docker: aiodocker.Docker,
     command: str,
     args: tuple[str, ...],
+    category: str = "",
     container_cwd: str,
     env_overrides: Mapping[str, str] | None,
     timeout: float,  # noqa: ASYNC109
 ) -> SandboxResult:
@@
     config = self._build_container_config(
         command=command,
         args=args,
+        category=category,
         container_cwd=container_cwd,
         env_overrides=env_overrides,
     )

Also applies to: 306-315

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/tools/sandbox/docker_sandbox.py` around lines 218 - 238, The
category parameter is never propagated into the execution path so per-category
runtime_overrides in _resolve_runtime are not applied; update the call chain to
pass category through: ensure callers that initiate container runs (e.g., the
method that calls _run_container) forward the category string into
_run_container, update _run_container to accept and forward category into
_build_container_config, and update _build_container_config to pass that
category into _resolve_runtime; reference the methods _run_container,
_build_container_config, and _resolve_runtime and make sure default
empty-category usage is removed so category-specific runtime_overrides are
applied.
♻️ Duplicate comments (8)
src/synthorg/security/rules/risk_override.py (1)

29-29: ⚠️ Potential issue | 🔴 Critical

NotBlankStr is a type alias, not callable—line 29 will fail at import.

NotBlankStr is an Annotated type from core.types, not a class constructor. Calling NotBlankStr("system") raises TypeError at module import time.

-_DEFAULT_REVOKED_BY = NotBlankStr("system")
+_DEFAULT_REVOKED_BY = "system"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/security/rules/risk_override.py` at line 29, The constant
_DEFAULT_REVOKED_BY incorrectly calls NotBlankStr as a constructor; replace the
call with a plain string value and express the type via annotation or an
explicit cast so the module import won't raise a TypeError—i.e., treat
NotBlankStr as an Annotated type (from core.types) and set _DEFAULT_REVOKED_BY
to the string "system" while annotating it as NotBlankStr or use
typing.cast(NotBlankStr, "system") to satisfy the type system.
src/synthorg/persistence/sqlite/ssrf_violation_repo.py (5)

198-204: ⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) for exception handling in Python 3.14 per PEP 758."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py` around lines 198 -
204, The except clause currently uses parentheses for the exception tuple;
update the exception handling in the SSRF listing block to use PEP 758
parentheses-free syntax so the clause reads except sqlite3.Error,
aiosqlite.Error as exc (keeping the same body that logs via logger.exception
with PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED and raises PersistenceError(msg)
from exc). Locate the block containing logger.exception,
PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED and the raise of PersistenceError to
make this single-line change.

252-259: ⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) for exception handling in Python 3.14 per PEP 758."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py` around lines 252 -
259, The except block currently uses parentheses for multi-exception catching
("except (sqlite3.Error, aiosqlite.Error) as exc:"); change it to the PEP 758
style without parentheses to catch sqlite3.Error and aiosqlite.Error (i.e., use
a comma-separated exception list) so the handler still calls
self._rollback_quietly(), logs via
logger.exception(PERSISTENCE_SSRF_VIOLATION_SAVE_FAILED, error=msg), and raises
PersistenceError(msg) from exc; leave remaining logic (msg formation, rollback,
logging, raise) unchanged.

162-169: ⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (ValueError, ValidationError) as exc:
+        except ValueError, ValidationError as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) for exception handling in Python 3.14 per PEP 758."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py` around lines 162 -
169, Change the except clause to use the PEP 758 comma form instead of
parentheses: replace the current "except (ValueError, ValidationError) as exc:"
with the PEP‑758 style that catches both exceptions in one clause (so the
handler still binds the exception to exc); keep the body intact (the
logger.exception call referencing PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED,
error=msg, violation_id=violation_id and the subsequent raise
PersistenceError(msg) from exc) so only the except header is modified to the new
syntax.

150-156: ⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) for exception handling in Python 3.14 per PEP 758."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py` around lines 150 -
156, The except block currently uses tuple-style parentheses for multiple
exceptions; update it to use the PEP 758 comma form so both exceptions are
caught without parentheses. Locate the except handling that references
PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED and raises PersistenceError (the block
with "except (sqlite3.Error, aiosqlite.Error) as exc:"), and change it to the
PEP‑758 form using a comma between exception types (keeping the same variable
name exc), leaving the logger.exception call and the raise PersistenceError(msg)
from exc unchanged.

124-131: ⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) for exception handling in Python 3.14 per PEP 758."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py` around lines 124 -
131, Change the except clause to use PEP 758 comma-separated syntax instead of
parentheses: replace "except (sqlite3.Error, aiosqlite.Error) as exc:" with
"except sqlite3.Error, aiosqlite.Error as exc:" in the block that calls
self._rollback_quietly(), logs PERSISTENCE_SSRF_VIOLATION_SAVE_FAILED via
logger.exception, and raises PersistenceError; keep the existing error handling,
message construction, and "raise ... from exc" semantics unchanged.
src/synthorg/settings/subscribers/security_subscriber.py (2)

101-108: ⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

The parenthesized form violates the project's PEP 758 convention for Python 3.14.

✏️ Proposed fix
-        except (json.JSONDecodeError, TypeError) as exc:
+        except json.JSONDecodeError, TypeError as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) for exception handling in Python 3.14 per PEP 758."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/settings/subscribers/security_subscriber.py` around lines 101 -
108, Replace the parenthesized exception tuple in the except clause handling
JSON decode/type errors with the PEP 758 style: change the clause that currently
reads except (json.JSONDecodeError, TypeError) as exc to the non-parenthesized
form except json.JSONDecodeError, TypeError as exc so the logger call using
SECURITY_ALLOWLIST_UPDATE_FAILED, namespace and key remains unchanged and the
variable exc still holds the exception.

78-81: ⚠️ Potential issue | 🟡 Minor

Remove unreachable fallback and handle SettingNotFoundError explicitly.

Per SettingsService.get(), the method raises SettingNotFoundError when the key is not registered—it never returns None. The fallback else "[]" on line 80 is dead code, and SettingNotFoundError will propagate unhandled. Catch and log it explicitly per coding guidelines.

Suggested fix
+        from synthorg.settings.errors import SettingNotFoundError
+
         try:
             setting = await self._settings_service.get(namespace, key)
-            raw = setting.value if setting is not None else "[]"
+            raw = setting.value
             entries = json.loads(raw)
             if not isinstance(entries, list):
                 logger.warning(
                     SECURITY_ALLOWLIST_UPDATE_FAILED,
                     namespace=namespace,
                     key=key,
                     error="expected JSON array",
                 )
                 return
             if any(
                 not isinstance(entry, str) or not entry.strip() for entry in entries
             ):
                 logger.warning(
                     SECURITY_ALLOWLIST_UPDATE_FAILED,
                     namespace=namespace,
                     key=key,
                     error="allowlist entries must be non-empty strings",
                 )
                 return
             allowlist = tuple(entry.strip() for entry in entries)
+        except SettingNotFoundError:
+            logger.warning(
+                SECURITY_ALLOWLIST_UPDATE_FAILED,
+                namespace=namespace,
+                key=key,
+                error="setting not registered",
+            )
+            return
-        except (json.JSONDecodeError, TypeError) as exc:
+        except json.JSONDecodeError, TypeError as exc:
             logger.warning(
                 SECURITY_ALLOWLIST_UPDATE_FAILED,
                 namespace=namespace,
                 key=key,
                 error=f"failed to parse allowlist: {exc}",
             )
             return
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/settings/subscribers/security_subscriber.py` around lines 78 -
81, The code treats SettingsService.get(namespace, key) as if it can return None
and provides a dead fallback "[]"; instead catch SettingNotFoundError from
SettingsService.get explicitly, remove the else "[]" fallback, and handle the
missing setting by logging the exception and using an empty list for entries.
Concretely, wrap the await self._settings_service.get(namespace, key) call in a
try/except catching SettingNotFoundError, call the subscriber's logger (e.g.
self._logger.error or .exception) with a clear message including namespace and
key, and set entries = [] in that except block; keep the existing
json.loads(raw) path for the normal case where setting is returned.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/persistence/sqlite/risk_override_repo.py`:
- Around line 126-133: Change the multi-exception except clause to use PEP 758
comma syntax (no parentheses): replace "except (sqlite3.Error, aiosqlite.Error)
as exc:" with the comma-separated form so the handler that calls
self._rollback_quietly(), logs via
logger.exception(PERSISTENCE_RISK_OVERRIDE_SAVE_FAILED, error=msg) and re-raises
PersistenceError(msg) from exc still runs for both sqlite3.Error and
aiosqlite.Error; keep the existing rollback, logging
(PERSISTENCE_RISK_OVERRIDE_SAVE_FAILED) and raise behavior unchanged.
- Around line 213-220: Replace the parenthesized multi-except with the PEP 758
style: change the exception clause in the block that calls
self._rollback_quietly() and raises PersistenceError so it uses "except
sqlite3.Error, aiosqlite.Error as exc:" (referencing the same
logger.exception(PERSISTENCE_RISK_OVERRIDE_SAVE_FAILED, ...), msg creation, and
raise PersistenceError(msg) from exc) to comply with the guideline; keep the
existing rollback, logging (including msg and error=msg), and re-raise behavior
unchanged.
- Around line 175-181: Change the multi-exception tuple syntax in the except
clause to the PEP 758 form: replace the current "except (sqlite3.Error,
aiosqlite.Error) as exc" with the PEP 758 style "except sqlite3.Error,
aiosqlite.Error as exc" in the error handling block inside the function that
lists active overrides (the block that logs
PERSISTENCE_RISK_OVERRIDE_QUERY_FAILED and raises PersistenceError); keep the
logger.exception and raise logic unchanged.
- Around line 152-158: Change the except clause that currently reads except
(sqlite3.Error, aiosqlite.Error) as exc to use the PEP 758 style without
parentheses (i.e., except sqlite3.Error, aiosqlite.Error as exc) so the handler
in the block that logs PERSISTENCE_RISK_OVERRIDE_QUERY_FAILED via
logger.exception and raises PersistenceError(msg) from exc uses the required
syntax; leave the body (msg creation, logger.exception call, and raise
PersistenceError) unchanged.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py`:
- Around line 218-261: The protocol's docstring for the SSRF violation
repository should explicitly state the "only-if-pending" update invariant to
match update_status's behavior: update_status(violation_id, *, status,
resolved_by, resolved_at) only changes a row when the stored status is 'pending'
(the implementation uses "WHERE id = ? AND status = 'pending'") and therefore
returns True only if a pending violation was found and transitioned; update the
protocol (the docstring on the repository protocol/class for SsrfViolationRepo)
to document this precondition and the exact return semantics.

In `@src/synthorg/tools/sandbox/docker_config.py`:
- Around line 87-94: The declared Field policy: SandboxPolicy | None is never
applied to override the flat config fields — add a pydantic validator
(preferably a `@root_validator`(pre=True) or `@model_validator`(mode="before")
depending on pydantic version) on the DockerConfig (or the containing pydantic
model) that, when policy is not None, merges policy.domain dicts (filesystem,
network, process, inference) into the incoming values so policy values override
the flat fields; implement safe shallow-merge logic (skip None domains), convert
nested SandboxPolicy domains to their dicts, and return the merged mapping so
the model fields are populated with policy-overrides during validation.

In `@tests/unit/api/fakes_backend.py`:
- Around line 70-83: The fake backend's revoke method uses ovr.is_active to
decide whether to block revocation, but the real SQLite repo only blocks when
revoked_at is not NULL; update the revoke method in
tests/unit/api/fakes_backend.py (function revoke, variable ovr and
self._overrides) to check ovr.revoked_at is None as the predicate (return False
if revoked_at is already set) and only then call model_copy to set revoked_at
and revoked_by so the fake mirrors the production mutation guard.

In `@tests/unit/tools/sandbox/test_credential_manager.py`:
- Around line 69-75: The test test_case_insensitive_matching in
tests/unit/tools/sandbox/test_credential_manager.py uses a vendor-specific key
"openai_api_key"; update the test to use a vendor-agnostic key (e.g.,
"test_provider_api_key" or "test-provider_api_key") in the env dict so
sanitize_env (SandboxCredentialManager.sanitize_env) still validates
case-insensitive removal without referencing real vendor names; keep the second
key "Api_Key" to test case variation and ensure assertions still check both keys
are removed from result.

---

Outside diff comments:
In `@src/synthorg/tools/sandbox/docker_sandbox.py`:
- Around line 218-238: The category parameter is never propagated into the
execution path so per-category runtime_overrides in _resolve_runtime are not
applied; update the call chain to pass category through: ensure callers that
initiate container runs (e.g., the method that calls _run_container) forward the
category string into _run_container, update _run_container to accept and forward
category into _build_container_config, and update _build_container_config to
pass that category into _resolve_runtime; reference the methods _run_container,
_build_container_config, and _resolve_runtime and make sure default
empty-category usage is removed so category-specific runtime_overrides are
applied.

---

Duplicate comments:
In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py`:
- Around line 198-204: The except clause currently uses parentheses for the
exception tuple; update the exception handling in the SSRF listing block to use
PEP 758 parentheses-free syntax so the clause reads except sqlite3.Error,
aiosqlite.Error as exc (keeping the same body that logs via logger.exception
with PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED and raises PersistenceError(msg)
from exc). Locate the block containing logger.exception,
PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED and the raise of PersistenceError to
make this single-line change.
- Around line 252-259: The except block currently uses parentheses for
multi-exception catching ("except (sqlite3.Error, aiosqlite.Error) as exc:");
change it to the PEP 758 style without parentheses to catch sqlite3.Error and
aiosqlite.Error (i.e., use a comma-separated exception list) so the handler
still calls self._rollback_quietly(), logs via
logger.exception(PERSISTENCE_SSRF_VIOLATION_SAVE_FAILED, error=msg), and raises
PersistenceError(msg) from exc; leave remaining logic (msg formation, rollback,
logging, raise) unchanged.
- Around line 162-169: Change the except clause to use the PEP 758 comma form
instead of parentheses: replace the current "except (ValueError,
ValidationError) as exc:" with the PEP‑758 style that catches both exceptions in
one clause (so the handler still binds the exception to exc); keep the body
intact (the logger.exception call referencing
PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED, error=msg, violation_id=violation_id
and the subsequent raise PersistenceError(msg) from exc) so only the except
header is modified to the new syntax.
- Around line 150-156: The except block currently uses tuple-style parentheses
for multiple exceptions; update it to use the PEP 758 comma form so both
exceptions are caught without parentheses. Locate the except handling that
references PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED and raises PersistenceError
(the block with "except (sqlite3.Error, aiosqlite.Error) as exc:"), and change
it to the PEP‑758 form using a comma between exception types (keeping the same
variable name exc), leaving the logger.exception call and the raise
PersistenceError(msg) from exc unchanged.
- Around line 124-131: Change the except clause to use PEP 758 comma-separated
syntax instead of parentheses: replace "except (sqlite3.Error, aiosqlite.Error)
as exc:" with "except sqlite3.Error, aiosqlite.Error as exc:" in the block that
calls self._rollback_quietly(), logs PERSISTENCE_SSRF_VIOLATION_SAVE_FAILED via
logger.exception, and raises PersistenceError; keep the existing error handling,
message construction, and "raise ... from exc" semantics unchanged.

In `@src/synthorg/security/rules/risk_override.py`:
- Line 29: The constant _DEFAULT_REVOKED_BY incorrectly calls NotBlankStr as a
constructor; replace the call with a plain string value and express the type via
annotation or an explicit cast so the module import won't raise a
TypeError—i.e., treat NotBlankStr as an Annotated type (from core.types) and set
_DEFAULT_REVOKED_BY to the string "system" while annotating it as NotBlankStr or
use typing.cast(NotBlankStr, "system") to satisfy the type system.

In `@src/synthorg/settings/subscribers/security_subscriber.py`:
- Around line 101-108: Replace the parenthesized exception tuple in the except
clause handling JSON decode/type errors with the PEP 758 style: change the
clause that currently reads except (json.JSONDecodeError, TypeError) as exc to
the non-parenthesized form except json.JSONDecodeError, TypeError as exc so the
logger call using SECURITY_ALLOWLIST_UPDATE_FAILED, namespace and key remains
unchanged and the variable exc still holds the exception.
- Around line 78-81: The code treats SettingsService.get(namespace, key) as if
it can return None and provides a dead fallback "[]"; instead catch
SettingNotFoundError from SettingsService.get explicitly, remove the else "[]"
fallback, and handle the missing setting by logging the exception and using an
empty list for entries. Concretely, wrap the await
self._settings_service.get(namespace, key) call in a try/except catching
SettingNotFoundError, call the subscriber's logger (e.g. self._logger.error or
.exception) with a clear message including namespace and key, and set entries =
[] in that except block; keep the existing json.loads(raw) path for the normal
case where setting is returned.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0e2f82f1-3f56-43d3-8850-37bfdb5bf44d

📥 Commits

Reviewing files that changed from the base of the PR and between bad9a43 and a13ea8f.

📒 Files selected for processing (34)
  • .github/workflows/dast.yml
  • CLAUDE.md
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/observability/events/security.py
  • src/synthorg/persistence/protocol.py
  • src/synthorg/persistence/risk_override_repo.py
  • src/synthorg/persistence/sqlite/backend.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
  • src/synthorg/persistence/sqlite/schema.sql
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/ssrf_violation_repo.py
  • src/synthorg/security/rules/risk_override.py
  • src/synthorg/security/ssrf_violation.py
  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/tools/sandbox/credential_manager.py
  • src/synthorg/tools/sandbox/docker_config.py
  • src/synthorg/tools/sandbox/docker_sandbox.py
  • src/synthorg/tools/sandbox/factory.py
  • src/synthorg/tools/sandbox/policy.py
  • src/synthorg/tools/sandbox/runtime_resolver.py
  • tests/unit/api/fakes_backend.py
  • tests/unit/persistence/sqlite/test_migrations.py
  • tests/unit/persistence/test_protocol.py
  • tests/unit/security/test_risk_override.py
  • tests/unit/security/test_ssrf_violation.py
  • tests/unit/settings/test_security_subscriber.py
  • tests/unit/tools/sandbox/test_auth_proxy.py
  • tests/unit/tools/sandbox/test_credential_manager.py
  • tests/unit/tools/sandbox/test_docker_config.py
  • tests/unit/tools/sandbox/test_policy.py
  • tests/unit/tools/sandbox/test_runtime_resolver.py
  • tests/unit/tools/sandbox/test_sandbox_factory.py

Comment on lines +126 to +133
except (sqlite3.Error, aiosqlite.Error) as exc:
await self._rollback_quietly()
msg = f"Failed to save risk override: {exc}"
logger.exception(
PERSISTENCE_RISK_OVERRIDE_SAVE_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) for exception handling in Python 3.14 per PEP 758."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/risk_override_repo.py` around lines 126 -
133, Change the multi-exception except clause to use PEP 758 comma syntax (no
parentheses): replace "except (sqlite3.Error, aiosqlite.Error) as exc:" with the
comma-separated form so the handler that calls self._rollback_quietly(), logs
via logger.exception(PERSISTENCE_RISK_OVERRIDE_SAVE_FAILED, error=msg) and
re-raises PersistenceError(msg) from exc still runs for both sqlite3.Error and
aiosqlite.Error; keep the existing rollback, logging
(PERSISTENCE_RISK_OVERRIDE_SAVE_FAILED) and raise behavior unchanged.

Comment on lines +152 to +158
except (sqlite3.Error, aiosqlite.Error) as exc:
msg = f"Failed to get risk override: {exc}"
logger.exception(
PERSISTENCE_RISK_OVERRIDE_QUERY_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) for exception handling in Python 3.14 per PEP 758."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
except (sqlite3.Error, aiosqlite.Error) as exc:
msg = f"Failed to get risk override: {exc}"
logger.exception(
PERSISTENCE_RISK_OVERRIDE_QUERY_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
except sqlite3.Error, aiosqlite.Error as exc:
msg = f"Failed to get risk override: {exc}"
logger.exception(
PERSISTENCE_RISK_OVERRIDE_QUERY_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/risk_override_repo.py` around lines 152 -
158, Change the except clause that currently reads except (sqlite3.Error,
aiosqlite.Error) as exc to use the PEP 758 style without parentheses (i.e.,
except sqlite3.Error, aiosqlite.Error as exc) so the handler in the block that
logs PERSISTENCE_RISK_OVERRIDE_QUERY_FAILED via logger.exception and raises
PersistenceError(msg) from exc uses the required syntax; leave the body (msg
creation, logger.exception call, and raise PersistenceError) unchanged.

Comment on lines +175 to +181
except (sqlite3.Error, aiosqlite.Error) as exc:
msg = f"Failed to list active overrides: {exc}"
logger.exception(
PERSISTENCE_RISK_OVERRIDE_QUERY_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) for exception handling in Python 3.14 per PEP 758."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
except (sqlite3.Error, aiosqlite.Error) as exc:
msg = f"Failed to list active overrides: {exc}"
logger.exception(
PERSISTENCE_RISK_OVERRIDE_QUERY_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
except sqlite3.Error, aiosqlite.Error as exc:
msg = f"Failed to list active overrides: {exc}"
logger.exception(
PERSISTENCE_RISK_OVERRIDE_QUERY_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/risk_override_repo.py` around lines 175 -
181, Change the multi-exception tuple syntax in the except clause to the PEP 758
form: replace the current "except (sqlite3.Error, aiosqlite.Error) as exc" with
the PEP 758 style "except sqlite3.Error, aiosqlite.Error as exc" in the error
handling block inside the function that lists active overrides (the block that
logs PERSISTENCE_RISK_OVERRIDE_QUERY_FAILED and raises PersistenceError); keep
the logger.exception and raise logic unchanged.

Comment on lines +213 to +220
except (sqlite3.Error, aiosqlite.Error) as exc:
await self._rollback_quietly()
msg = f"Failed to revoke risk override: {exc}"
logger.exception(
PERSISTENCE_RISK_OVERRIDE_SAVE_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) for exception handling in Python 3.14 per PEP 758."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
except (sqlite3.Error, aiosqlite.Error) as exc:
await self._rollback_quietly()
msg = f"Failed to revoke risk override: {exc}"
logger.exception(
PERSISTENCE_RISK_OVERRIDE_SAVE_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
except sqlite3.Error, aiosqlite.Error as exc:
await self._rollback_quietly()
msg = f"Failed to revoke risk override: {exc}"
logger.exception(
PERSISTENCE_RISK_OVERRIDE_SAVE_FAILED,
error=msg,
)
raise PersistenceError(msg) from exc
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/risk_override_repo.py` around lines 213 -
220, Replace the parenthesized multi-except with the PEP 758 style: change the
exception clause in the block that calls self._rollback_quietly() and raises
PersistenceError so it uses "except sqlite3.Error, aiosqlite.Error as exc:"
(referencing the same logger.exception(PERSISTENCE_RISK_OVERRIDE_SAVE_FAILED,
...), msg creation, and raise PersistenceError(msg) from exc) to comply with the
guideline; keep the existing rollback, logging (including msg and error=msg),
and re-raise behavior unchanged.

Aureliolo added 14 commits April 7, 2026 12:18
…y, credential manager (#696)

- Add runtime_overrides to DockerSandboxConfig for per-category
  container runtime selection (parallel to network_overrides)
- Create SandboxRuntimeResolver with Docker daemon probing and
  automatic runc fallback when gVisor unavailable
- Inject default gVisor overrides for code_execution and terminal
  categories via merge_gvisor_defaults()
- Create SandboxPolicy 4-domain model (filesystem, network, process,
  inference) as optional overlay on DockerSandboxConfig
- Create SandboxCredentialManager for env sanitization (strips
  credential-like keys from sandbox env_overrides)
- Create SandboxAuthProxy interface (NotImplementedError stub for
  future aiohttp proxy server)
- Create RiskTierOverride model with mandatory expiration, revocation,
  and cross-tier validation (frozen Pydantic, audit-ready)
- Create SecOpsRiskClassifier wrapping RiskClassifier with runtime
  override support (last-added-wins for same action_type)
- Create RiskOverrideRepository protocol and SQLite implementation
  with active-only queries and atomic revocation
- Add risk_overrides table to schema.sql with action_type and
  expires_at indexes
- Add security and persistence event constants
- Create SsrfViolation model with status enum (pending/allowed/denied)
  and resolution tracking (frozen Pydantic, validation enforced)
- Create SsrfViolationRepository protocol and SQLite implementation
  with status-filtered queries and atomic status updates
- Add ssrf_violations table to schema.sql with status and timestamp
  indexes
- Register security/discovery_allowlist setting for hot-reload of
  provider discovery SSRF allowlist
- Create SecuritySubscriber watching discovery_allowlist changes
- Add risk_overrides and ssrf_violations to PersistenceBackend
  protocol and SQLite backend
- Add persistence event constants for both repositories
…, fix mypy

- Add missing protocol members to FakePersistenceBackend (fakes_backend.py)
  and _FakeBackend (test_protocol.py)
- Fix _row_to_override/_row_to_violation type annotations (Any vs tuple)
- Fix StrEnum comparison-overlap in test_ssrf_violation.py
- Use Any return type for stub risk_overrides/ssrf_violations properties
- Fix tuple type annotation in ssrf_violation_repo list_violations
- Add Any import to test_protocol.py
Pre-reviewed by 7 agents, 28 findings addressed:

Security: implement SecuritySubscriber callback, remove duplicate
discovery_allowlist setting, add 8 credential patterns, default
route_through_proxy to False, reject PENDING in update_status.

Persistence: wire shared write lock, use sqlite_errorname for UNIQUE
detection, add rollback before DuplicateRecordError, guard naive
datetimes via _ensure_utc, add partial composite indexes, add CHECK
constraint on ssrf_violations.status, skip-and-log deserialization.

Models: action_type to NotBlankStr, paired-nullability validator for
revoked_at/by, revoke_override accepts revoked_by param, resolved_ip
to NotBlankStr, MappingProxyType for _DEFAULT_GVISOR_OVERRIDES.

Conventions: vendor name in test, CLAUDE.md Package Structure update.
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

♻️ Duplicate comments (14)
src/synthorg/settings/subscribers/security_subscriber.py (2)

101-108: ⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

The parenthesized form except (json.JSONDecodeError, TypeError) violates the project's Python 3.14 PEP 758 convention.

✏️ Proposed fix
-        except (json.JSONDecodeError, TypeError) as exc:
+        except json.JSONDecodeError, TypeError as exc:

As per coding guidelines: "Use except A, B: syntax without parentheses -- ruff enforces this on Python 3.14."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/settings/subscribers/security_subscriber.py` around lines 101 -
108, Change the exception clause to use PEP 758 unparenthesized syntax: replace
the current "except (json.JSONDecodeError, TypeError) as exc" with the
unparenthesized form "except json.JSONDecodeError, TypeError as exc" so the
except clause around the logger.warning call (which logs
SECURITY_ALLOWLIST_UPDATE_FAILED with namespace, key and error=f"failed to parse
allowlist: {exc}") conforms to the project's Python 3.14 convention.

78-81: ⚠️ Potential issue | 🟡 Minor

Unreachable fallback—get() raises on missing key, never returns None.

Per the SettingsService.get() contract (context snippet 1), the method raises SettingNotFoundError when the key isn't registered—it never returns None. The else "[]" fallback on line 80 is dead code, and SettingNotFoundError will propagate unhandled if the setting isn't registered.

If fail-closed behavior is intended, catch SettingNotFoundError explicitly and log it. Otherwise, simplify to raw = setting.value if the setting is guaranteed to be registered.

🛡️ Option: Fail-closed with explicit handling
+        from synthorg.settings.errors import SettingNotFoundError
+
         try:
             setting = await self._settings_service.get(namespace, key)
-            raw = setting.value if setting is not None else "[]"
+            raw = setting.value
             entries = json.loads(raw)
+        except SettingNotFoundError:
+            logger.warning(
+                SECURITY_ALLOWLIST_UPDATE_FAILED,
+                namespace=namespace,
+                key=key,
+                error="setting not registered",
+            )
+            return
-        except (json.JSONDecodeError, TypeError) as exc:
+        except json.JSONDecodeError, TypeError as exc:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/settings/subscribers/security_subscriber.py` around lines 78 -
81, The current try block uses self._settings_service.get(namespace, key) and
then uses "setting is not None else '[]'" which is unreachable because
SettingsService.get raises SettingNotFoundError on missing keys; either catch
SettingNotFoundError explicitly around the call to self._settings_service.get
and handle it (e.g., set raw = "[]" and log the missing setting) to implement
fail-closed behavior, or remove the unreachable fallback and directly use raw =
setting.value if you guarantee the setting exists; refer to
_settings_service.get, SettingNotFoundError, setting, raw, and entries when
making the change.
src/synthorg/persistence/sqlite/risk_override_repo.py (4)

152-158: ⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) — ruff enforces this on Python 3.14."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/risk_override_repo.py` around lines 152 -
158, The except clause uses parentheses around the exception tuple; change
"except (sqlite3.Error, aiosqlite.Error) as exc:" to the PEP‑758 form "except
sqlite3.Error, aiosqlite.Error as exc:" in the block that logs
PERSISTENCE_RISK_OVERRIDE_QUERY_FAILED and raises PersistenceError so the
logger.exception and raise PersistenceError(msg) from exc behavior is preserved.

213-220: ⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) — ruff enforces this on Python 3.14."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/risk_override_repo.py` around lines 213 -
220, The except clause uses parenthesized exception tuple; update the except
syntax to the PEP 758 form used by ruff by changing "except (sqlite3.Error,
aiosqlite.Error) as exc:" to "except sqlite3.Error, aiosqlite.Error as exc:" in
the method that calls self._rollback_quietly() and raises PersistenceError with
PERSISTENCE_RISK_OVERRIDE_SAVE_FAILED; ensure logger.exception and raising
PersistenceError(msg) from exc remain unchanged.

175-181: ⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) — ruff enforces this on Python 3.14."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/risk_override_repo.py` around lines 175 -
181, Change the tuple-style except to PEP 758 syntax by replacing "except
(sqlite3.Error, aiosqlite.Error) as exc:" with "except sqlite3.Error,
aiosqlite.Error as exc:" in the exception handling that logs
PERSISTENCE_RISK_OVERRIDE_QUERY_FAILED and raises PersistenceError; keep the
logger.exception call and the raised PersistenceError unchanged (i.e., only
remove the parentheses around the exception types).

126-133: ⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) — ruff enforces this on Python 3.14."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/risk_override_repo.py` around lines 126 -
133, Change the except clause to use PEP 758 comma-separated syntax (no
parentheses) so the exception handler reads like "except sqlite3.Error,
aiosqlite.Error as exc:"; locate the block that calls await
self._rollback_quietly(), logs PERSISTENCE_RISK_OVERRIDE_SAVE_FAILED via
logger.exception and raises PersistenceError(msg) and replace the current
"except (sqlite3.Error, aiosqlite.Error) as exc:" with the new comma-separated
form.
src/synthorg/tools/sandbox/docker_sandbox.py (2)

239-243: ⚠️ Potential issue | 🟠 Major

Credential sanitization may strip legitimate environment variables.

The SandboxCredentialManager.sanitize_env() call on line 240 applies pattern-based filtering using broad substrings like token and secret. This can inadvertently remove non-secret variables such as MAX_TOKENS or TOKENIZER_CACHE.

Since this path uses sanitize_env() rather than sanitize_env_with_report(), removals are silent. Consider either:

  1. Using sanitize_env_with_report() to log which keys were stripped
  2. Tightening the credential patterns to use word boundaries or exact prefixes
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/tools/sandbox/docker_sandbox.py` around lines 239 - 243, The
call to SandboxCredentialManager.sanitize_env(env_overrides) in
docker_sandbox.py may silently strip benign keys (e.g., MAX_TOKENS) due to broad
patterns; update the code to either call sanitize_env_with_report(env_overrides)
and log/report removed keys so stripping is visible, or tighten the
credential-matching logic inside SandboxCredentialManager (replace substring
matches like "token"/"secret" with word-boundary/ exact-prefix rules) so only
true secrets are removed; locate usages around the sanitized = ... assignment
and the implementations of sanitize_env and sanitize_env_with_report to apply
the change.

313-321: ⚠️ Potential issue | 🟠 Major

Runtime resolver cannot be invoked by default callers due to empty category guard.

The execute() method defaults category to an empty string, and both code_runner.py:117 and _git_base.py:530 omit this parameter entirely. However, _resolve_runtime() checks if self._runtime_resolver is not None and category: (line 319), which short-circuits when category is empty. This means the resolver is never consulted for these high-risk execution paths, preventing any runsc overrides from being applied.

Fix by either:

  1. Making category required when a resolver is attached (fail-closed), or
  2. Removing the and category guard so the resolver is always consulted when attached (only check if resolver exists)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/tools/sandbox/docker_sandbox.py` around lines 313 - 321, The
category-empty guard prevents the SandboxRuntimeResolver from ever being used
for default/omitted callers; update _resolve_runtime to consult
self._runtime_resolver whenever it is set (remove the "and category" check) so
resolve_runtime(category) is invoked even when category is an empty string, and
ensure callers handle a None return by falling back to self._config.runtime;
alternatively enforce that execute() and callers supply a non-empty category
when a resolver is attached, but prefer removing the category truthiness guard
in _resolve_runtime so runsc overrides are applied for default execution paths.
src/synthorg/persistence/sqlite/ssrf_violation_repo.py (5)

124-131: ⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) — ruff enforces this on Python 3.14."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py` around lines 124 -
131, Change the multi-exception tuple syntax in the except clause to PEP 758
style: replace "except (sqlite3.Error, aiosqlite.Error) as exc" with "except
sqlite3.Error, aiosqlite.Error as exc" in the save flow inside
ssrf_violation_repo (the block that calls self._rollback_quietly(), logs via
logger.exception with PERSISTENCE_SSRF_VIOLATION_SAVE_FAILED, and raises
PersistenceError) so the handler catches both sqlite3.Error and aiosqlite.Error
using the new syntax.

252-259: ⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) — ruff enforces this on Python 3.14."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py` around lines 252 -
259, The except clause in the SSRF violation update block uses parentheses;
change "except (sqlite3.Error, aiosqlite.Error) as exc" to the PEP 758 form
"except sqlite3.Error, aiosqlite.Error as exc" in the method that calls await
self._rollback_quietly(), keeping the rest of the handler intact (the
rollback_quietly call, constructing msg, logger.exception with
PERSISTENCE_SSRF_VIOLATION_SAVE_FAILED, and raising PersistenceError(msg) from
exc) so behavior is unchanged but the syntax conforms to the guideline; update
only the except line in ssrf_violation_repo.py where that block appears.

162-169: ⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (ValueError, ValidationError) as exc:
+        except ValueError, ValidationError as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) — ruff enforces this on Python 3.14."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py` around lines 162 -
169, Replace the parenthesized except clause with the PEP 758 style: change
"except (ValueError, ValidationError) as exc:" to "except ValueError,
ValidationError as exc:" in the block that builds msg and calls logger.exception
with PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED (the block that references
violation_id, logger.exception(...) and raises PersistenceError(msg) from exc).
Keep the existing error handling, logging fields (error=msg,
violation_id=violation_id) and the chained raise intact.

198-204: ⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) — ruff enforces this on Python 3.14."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py` around lines 198 -
204, The except clause currently uses parentheses around multiple exceptions;
change the clause handling sqlite3.Error and aiosqlite.Error in the SSRF
violation listing block to use the PEP 758 style without parentheses (i.e., use
"except sqlite3.Error, aiosqlite.Error as exc:"), keeping the body intact (msg
creation, logger.exception with PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED, and
raising PersistenceError(msg) from exc) so the same error handling behavior is
preserved.

150-156: ⚠️ Potential issue | 🟡 Minor

Use PEP 758 except syntax without parentheses.

✏️ Proposed fix
-        except (sqlite3.Error, aiosqlite.Error) as exc:
+        except sqlite3.Error, aiosqlite.Error as exc:

As per coding guidelines: "Use except A, B: syntax (no parentheses) — ruff enforces this on Python 3.14."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py` around lines 150 -
156, Replace the tuple-except syntax with the new PEP 758 form: change the
except line that currently reads "except (sqlite3.Error, aiosqlite.Error) as
exc:" to use the no-parentheses form "except sqlite3.Error, aiosqlite.Error as
exc:" in the block that logs PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED and raises
PersistenceError so the exception handler uses the mandated syntax while keeping
the existing logger.exception(...) and raise PersistenceError(msg) from exc
behavior unchanged.
src/synthorg/security/rules/risk_override.py (1)

29-29: ⚠️ Potential issue | 🔴 Critical

NotBlankStr is a type alias, not a constructor—this will fail at import.

NotBlankStr is an Annotated type alias and cannot be called. This line raises TypeError when the module is imported.

🐛 Proposed fix
-_DEFAULT_REVOKED_BY = NotBlankStr("system")
+_DEFAULT_REVOKED_BY: NotBlankStr = "system"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/security/rules/risk_override.py` at line 29, The code assigns
_DEFAULT_REVOKED_BY by calling NotBlankStr (a type alias) which raises
TypeError; change the assignment to use a proper typed value instead of a
call—for example declare _DEFAULT_REVOKED_BY: NotBlankStr = "system" (or use
typing.cast(NotBlankStr, "system")) so _DEFAULT_REVOKED_BY is a string with the
NotBlankStr annotation rather than an attempted constructor call.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/persistence/sqlite/schema.sql`:
- Around line 550-553: The partial index idx_ro_active on risk_overrides
currently uses (expires_at, created_at DESC); to let SQLite avoid an extra sort
for the frequent list_active() query in risk_override_repo.py (the list_active()
function filters with "expires_at > ?" and orders by created_at DESC), change
the index column order to (created_at DESC, expires_at) while keeping the WHERE
revoked_at IS NULL predicate so the planner can scan in created_at order and
apply the expires_at filter without a separate sort.

In `@src/synthorg/tools/sandbox/docker_config.py`:
- Around line 87-94: The policy Field (policy: SandboxPolicy) in
docker_config.py is dead: it claims to be "consumed by the sandbox execution
layer" but is never read by DockerSandbox or SandboxRuntimeResolver; either
remove the Field and its docstring to avoid misleading API surface, or actually
thread the policy into runtime by adding a parameter to DockerSandbox (e.g.,
accept a SandboxPolicy in its constructor) and ensure SandboxRuntimeResolver
passes the configured policy when constructing DockerSandbox and any
executor/apply_policy method applies constraints at container startup; update
tests and type hints for SandboxPolicy, DockerSandbox, and
SandboxRuntimeResolver accordingly.

In `@src/synthorg/tools/sandbox/factory.py`:
- Around line 39-44: Change the typing and values of _DEFAULT_GVISOR_OVERRIDES
to use NotBlankStr instead of plain str so it matches
DockerSandboxConfig.runtime_overrides; specifically, import NotBlankStr and
update the MappingProxyType annotation and the literal keys/values
("code_execution", "terminal", "runsc") to be NotBlankStr instances (matching
the pattern used for _DEFAULT_REVOKED_BY) so mypy and type checks align with the
model's dict[NotBlankStr, NotBlankStr] expectation.

In `@tests/unit/security/test_ssrf_violation.py`:
- Around line 27-37: Replace vendor-specific provider names used in the tests
with a generic test provider name: update the SsrfViolation instances (e.g., in
test_pending_violation and the other test that sets provider_name to "ollama")
to use a vendor-agnostic string like "test-provider" (or "test-small-001") so
the tests follow the project’s vendor-neutral naming rule; locate occurrences by
searching for SsrfViolation(... provider_name= or direct "ollama" usages and
change them accordingly.

In `@tests/unit/tools/sandbox/test_credential_manager.py`:
- Around line 40-44: The test test_strips_credential_vars uses a vendor-specific
key "AWS_CREDENTIAL"; update the test to use a vendor-agnostic key (e.g.,
"CREDENTIAL" or "SOME_CREDENTIAL") in both the env input and the assertion so it
still verifies that SandboxCredentialManager.sanitize_env removes
credential-like env vars; change the env dict key and the assert to reference
the new generic key and keep the rest of the test unchanged.

In `@tests/unit/tools/sandbox/test_sandbox_factory.py`:
- Around line 283-331: The new TestMergeGvisorDefaults class was inserted
between TestCleanupSandboxBackends and its final method, leaving
test_cleanup_continues_on_error orphaned; fix by relocating tests so grouping is
preserved—either move the entire TestMergeGvisorDefaults class to after all
TestCleanupSandboxBackends tests, or move the orphaned
test_cleanup_continues_on_error back into the TestCleanupSandboxBackends class;
update the file so TestCleanupSandboxBackends contains
test_cleanup_continues_on_error (or place TestMergeGvisorDefaults at file end)
ensuring class membership is restored.

---

Duplicate comments:
In `@src/synthorg/persistence/sqlite/risk_override_repo.py`:
- Around line 152-158: The except clause uses parentheses around the exception
tuple; change "except (sqlite3.Error, aiosqlite.Error) as exc:" to the PEP‑758
form "except sqlite3.Error, aiosqlite.Error as exc:" in the block that logs
PERSISTENCE_RISK_OVERRIDE_QUERY_FAILED and raises PersistenceError so the
logger.exception and raise PersistenceError(msg) from exc behavior is preserved.
- Around line 213-220: The except clause uses parenthesized exception tuple;
update the except syntax to the PEP 758 form used by ruff by changing "except
(sqlite3.Error, aiosqlite.Error) as exc:" to "except sqlite3.Error,
aiosqlite.Error as exc:" in the method that calls self._rollback_quietly() and
raises PersistenceError with PERSISTENCE_RISK_OVERRIDE_SAVE_FAILED; ensure
logger.exception and raising PersistenceError(msg) from exc remain unchanged.
- Around line 175-181: Change the tuple-style except to PEP 758 syntax by
replacing "except (sqlite3.Error, aiosqlite.Error) as exc:" with "except
sqlite3.Error, aiosqlite.Error as exc:" in the exception handling that logs
PERSISTENCE_RISK_OVERRIDE_QUERY_FAILED and raises PersistenceError; keep the
logger.exception call and the raised PersistenceError unchanged (i.e., only
remove the parentheses around the exception types).
- Around line 126-133: Change the except clause to use PEP 758 comma-separated
syntax (no parentheses) so the exception handler reads like "except
sqlite3.Error, aiosqlite.Error as exc:"; locate the block that calls await
self._rollback_quietly(), logs PERSISTENCE_RISK_OVERRIDE_SAVE_FAILED via
logger.exception and raises PersistenceError(msg) and replace the current
"except (sqlite3.Error, aiosqlite.Error) as exc:" with the new comma-separated
form.

In `@src/synthorg/persistence/sqlite/ssrf_violation_repo.py`:
- Around line 124-131: Change the multi-exception tuple syntax in the except
clause to PEP 758 style: replace "except (sqlite3.Error, aiosqlite.Error) as
exc" with "except sqlite3.Error, aiosqlite.Error as exc" in the save flow inside
ssrf_violation_repo (the block that calls self._rollback_quietly(), logs via
logger.exception with PERSISTENCE_SSRF_VIOLATION_SAVE_FAILED, and raises
PersistenceError) so the handler catches both sqlite3.Error and aiosqlite.Error
using the new syntax.
- Around line 252-259: The except clause in the SSRF violation update block uses
parentheses; change "except (sqlite3.Error, aiosqlite.Error) as exc" to the PEP
758 form "except sqlite3.Error, aiosqlite.Error as exc" in the method that calls
await self._rollback_quietly(), keeping the rest of the handler intact (the
rollback_quietly call, constructing msg, logger.exception with
PERSISTENCE_SSRF_VIOLATION_SAVE_FAILED, and raising PersistenceError(msg) from
exc) so behavior is unchanged but the syntax conforms to the guideline; update
only the except line in ssrf_violation_repo.py where that block appears.
- Around line 162-169: Replace the parenthesized except clause with the PEP 758
style: change "except (ValueError, ValidationError) as exc:" to "except
ValueError, ValidationError as exc:" in the block that builds msg and calls
logger.exception with PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED (the block that
references violation_id, logger.exception(...) and raises PersistenceError(msg)
from exc). Keep the existing error handling, logging fields (error=msg,
violation_id=violation_id) and the chained raise intact.
- Around line 198-204: The except clause currently uses parentheses around
multiple exceptions; change the clause handling sqlite3.Error and
aiosqlite.Error in the SSRF violation listing block to use the PEP 758 style
without parentheses (i.e., use "except sqlite3.Error, aiosqlite.Error as exc:"),
keeping the body intact (msg creation, logger.exception with
PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED, and raising PersistenceError(msg) from
exc) so the same error handling behavior is preserved.
- Around line 150-156: Replace the tuple-except syntax with the new PEP 758
form: change the except line that currently reads "except (sqlite3.Error,
aiosqlite.Error) as exc:" to use the no-parentheses form "except sqlite3.Error,
aiosqlite.Error as exc:" in the block that logs
PERSISTENCE_SSRF_VIOLATION_QUERY_FAILED and raises PersistenceError so the
exception handler uses the mandated syntax while keeping the existing
logger.exception(...) and raise PersistenceError(msg) from exc behavior
unchanged.

In `@src/synthorg/security/rules/risk_override.py`:
- Line 29: The code assigns _DEFAULT_REVOKED_BY by calling NotBlankStr (a type
alias) which raises TypeError; change the assignment to use a proper typed value
instead of a call—for example declare _DEFAULT_REVOKED_BY: NotBlankStr =
"system" (or use typing.cast(NotBlankStr, "system")) so _DEFAULT_REVOKED_BY is a
string with the NotBlankStr annotation rather than an attempted constructor
call.

In `@src/synthorg/settings/subscribers/security_subscriber.py`:
- Around line 101-108: Change the exception clause to use PEP 758
unparenthesized syntax: replace the current "except (json.JSONDecodeError,
TypeError) as exc" with the unparenthesized form "except json.JSONDecodeError,
TypeError as exc" so the except clause around the logger.warning call (which
logs SECURITY_ALLOWLIST_UPDATE_FAILED with namespace, key and error=f"failed to
parse allowlist: {exc}") conforms to the project's Python 3.14 convention.
- Around line 78-81: The current try block uses
self._settings_service.get(namespace, key) and then uses "setting is not None
else '[]'" which is unreachable because SettingsService.get raises
SettingNotFoundError on missing keys; either catch SettingNotFoundError
explicitly around the call to self._settings_service.get and handle it (e.g.,
set raw = "[]" and log the missing setting) to implement fail-closed behavior,
or remove the unreachable fallback and directly use raw = setting.value if you
guarantee the setting exists; refer to _settings_service.get,
SettingNotFoundError, setting, raw, and entries when making the change.

In `@src/synthorg/tools/sandbox/docker_sandbox.py`:
- Around line 239-243: The call to
SandboxCredentialManager.sanitize_env(env_overrides) in docker_sandbox.py may
silently strip benign keys (e.g., MAX_TOKENS) due to broad patterns; update the
code to either call sanitize_env_with_report(env_overrides) and log/report
removed keys so stripping is visible, or tighten the credential-matching logic
inside SandboxCredentialManager (replace substring matches like "token"/"secret"
with word-boundary/ exact-prefix rules) so only true secrets are removed; locate
usages around the sanitized = ... assignment and the implementations of
sanitize_env and sanitize_env_with_report to apply the change.
- Around line 313-321: The category-empty guard prevents the
SandboxRuntimeResolver from ever being used for default/omitted callers; update
_resolve_runtime to consult self._runtime_resolver whenever it is set (remove
the "and category" check) so resolve_runtime(category) is invoked even when
category is an empty string, and ensure callers handle a None return by falling
back to self._config.runtime; alternatively enforce that execute() and callers
supply a non-empty category when a resolver is attached, but prefer removing the
category truthiness guard in _resolve_runtime so runsc overrides are applied for
default execution paths.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8f5df21e-870c-4596-88f9-561d0a5d39a1

📥 Commits

Reviewing files that changed from the base of the PR and between 49c4b14 and d38ce3d.

📒 Files selected for processing (34)
  • .github/workflows/dast.yml
  • CLAUDE.md
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/observability/events/sandbox.py
  • src/synthorg/observability/events/security.py
  • src/synthorg/persistence/protocol.py
  • src/synthorg/persistence/risk_override_repo.py
  • src/synthorg/persistence/sqlite/backend.py
  • src/synthorg/persistence/sqlite/risk_override_repo.py
  • src/synthorg/persistence/sqlite/schema.sql
  • src/synthorg/persistence/sqlite/ssrf_violation_repo.py
  • src/synthorg/persistence/ssrf_violation_repo.py
  • src/synthorg/security/rules/risk_override.py
  • src/synthorg/security/ssrf_violation.py
  • src/synthorg/settings/subscribers/security_subscriber.py
  • src/synthorg/tools/sandbox/auth_proxy.py
  • src/synthorg/tools/sandbox/credential_manager.py
  • src/synthorg/tools/sandbox/docker_config.py
  • src/synthorg/tools/sandbox/docker_sandbox.py
  • src/synthorg/tools/sandbox/factory.py
  • src/synthorg/tools/sandbox/policy.py
  • src/synthorg/tools/sandbox/runtime_resolver.py
  • tests/unit/api/fakes_backend.py
  • tests/unit/persistence/sqlite/test_migrations.py
  • tests/unit/persistence/test_protocol.py
  • tests/unit/security/test_risk_override.py
  • tests/unit/security/test_ssrf_violation.py
  • tests/unit/settings/test_security_subscriber.py
  • tests/unit/tools/sandbox/test_auth_proxy.py
  • tests/unit/tools/sandbox/test_credential_manager.py
  • tests/unit/tools/sandbox/test_docker_config.py
  • tests/unit/tools/sandbox/test_policy.py
  • tests/unit/tools/sandbox/test_runtime_resolver.py
  • tests/unit/tools/sandbox/test_sandbox_factory.py

Comment on lines +87 to +94
policy: SandboxPolicy | None = Field(
default=None,
description=(
"Structured 4-domain policy overlay (filesystem, network, "
"process, inference). Consumed by the sandbox execution "
"layer to apply domain-specific constraints at runtime."
),
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify that DockerSandbox or related components read config.policy

echo "=== Checking usage of config.policy or self._config.policy ==="
rg -n --type=py '\.policy\b' src/synthorg/tools/sandbox/

echo ""
echo "=== Checking if SandboxPolicy is referenced in docker_sandbox.py ==="
rg -n 'SandboxPolicy|\.policy' src/synthorg/tools/sandbox/docker_sandbox.py

Repository: Aureliolo/synthorg

Length of output: 415


🏁 Script executed:

# Check DockerSandbox implementation for policy usage
cat -n src/synthorg/tools/sandbox/docker_sandbox.py | head -100

# Search for SandboxRuntimeResolver
echo ""
echo "=== Searching for SandboxRuntimeResolver ==="
rg -n 'SandboxRuntimeResolver' src/synthorg/

# Check for TODOs or FIXMEs related to policy
echo ""
echo "=== Checking for TODOs/FIXMEs related to policy ==="
rg -n -i 'TODO|FIXME' src/synthorg/tools/sandbox/ | grep -i policy

Repository: Aureliolo/synthorg

Length of output: 4359


🏁 Script executed:

# View DockerSandbox.__init__ full implementation
sed -n '97,150p' src/synthorg/tools/sandbox/docker_sandbox.py

# View SandboxRuntimeResolver implementation
echo ""
echo "=== SandboxRuntimeResolver ==="
head -100 src/synthorg/tools/sandbox/runtime_resolver.py

# Search for where policy attribute is accessed/used
echo ""
echo "=== Searching for config.policy access ==="
rg -n 'config\.policy\|self\._config\.policy\|_config\.policy' src/synthorg/tools/sandbox/

Repository: Aureliolo/synthorg

Length of output: 5374


🏁 Script executed:

# View rest of SandboxRuntimeResolver to confirm no policy usage
tail -100 src/synthorg/tools/sandbox/runtime_resolver.py

# Search for any usage of policy in entire docker_sandbox.py
echo ""
echo "=== Full search for 'policy' in docker_sandbox.py ==="
rg -n 'policy' src/synthorg/tools/sandbox/docker_sandbox.py

# Check if policy is used in execute or other key methods
echo ""
echo "=== Check execute method and other methods ==="
grep -n 'def execute\|def _execute\|def apply\|def enforce' src/synthorg/tools/sandbox/docker_sandbox.py

Repository: Aureliolo/synthorg

Length of output: 3583


Remove unused policy field or implement its consumption.

The policy field is defined with a docstring claiming it's "consumed by the sandbox execution layer," but verification shows it is never read by DockerSandbox, SandboxRuntimeResolver, or any other runtime component. Either implement the feature to actually apply policy constraints at runtime, or remove the field as unused dead code. The current state—a defined field with a misleading docstring—creates false expectations about runtime behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/tools/sandbox/docker_config.py` around lines 87 - 94, The policy
Field (policy: SandboxPolicy) in docker_config.py is dead: it claims to be
"consumed by the sandbox execution layer" but is never read by DockerSandbox or
SandboxRuntimeResolver; either remove the Field and its docstring to avoid
misleading API surface, or actually thread the policy into runtime by adding a
parameter to DockerSandbox (e.g., accept a SandboxPolicy in its constructor) and
ensure SandboxRuntimeResolver passes the configured policy when constructing
DockerSandbox and any executor/apply_policy method applies constraints at
container startup; update tests and type hints for SandboxPolicy, DockerSandbox,
and SandboxRuntimeResolver accordingly.

Comment on lines +39 to +44
_DEFAULT_GVISOR_OVERRIDES: MappingProxyType[str, str] = MappingProxyType(
{
"code_execution": "runsc",
"terminal": "runsc",
}
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the runtime_overrides field type in DockerSandboxConfig
rg -n "runtime_overrides.*Field" --type=py -C2 src/synthorg/tools/sandbox/

Repository: Aureliolo/synthorg

Length of output: 518


🏁 Script executed:

#!/bin/bash
# Find and examine NotBlankStr definition
rg -n "NotBlankStr" src/synthorg/core/types.py -A5 -B2 | head -50

Repository: Aureliolo/synthorg

Length of output: 279


🏁 Script executed:

#!/bin/bash
# Look for examples of NotBlankStr usage in the codebase
rg -n "NotBlankStr\(" src/synthorg --type=py | head -20

Repository: Aureliolo/synthorg

Length of output: 1798


🏁 Script executed:

#!/bin/bash
# Check merge_gvisor_defaults function to understand how values are used
rg -n "merge_gvisor_defaults" src/synthorg/tools/sandbox/factory.py -A30 | head -60

Repository: Aureliolo/synthorg

Length of output: 1193


🏁 Script executed:

#!/bin/bash
# Check the _DEFAULT_REVOKED_BY pattern for similar usage
rg -n "_DEFAULT_REVOKED_BY" src/synthorg/security/rules/risk_override.py -B2 -A10

Repository: Aureliolo/synthorg

Length of output: 900


Use NotBlankStr for _DEFAULT_GVISOR_OVERRIDES keys and values for type consistency.

The runtime_overrides field in DockerSandboxConfig is typed as dict[NotBlankStr, NotBlankStr]. The _DEFAULT_GVISOR_OVERRIDES mapping currently uses plain str values, which works due to Pydantic coercion during model_copy, but misaligns the type annotation. Since "code_execution" and "terminal" are identifier fields, use NotBlankStr consistently:

Suggested change
+from synthorg.core.types import NotBlankStr
+
-_DEFAULT_GVISOR_OVERRIDES: MappingProxyType[str, str] = MappingProxyType(
+_DEFAULT_GVISOR_OVERRIDES: MappingProxyType[NotBlankStr, NotBlankStr] = MappingProxyType(
     {
-        "code_execution": "runsc",
-        "terminal": "runsc",
+        NotBlankStr("code_execution"): NotBlankStr("runsc"),
+        NotBlankStr("terminal"): NotBlankStr("runsc"),
     }
 )

This pattern follows the codebase convention (see _DEFAULT_REVOKED_BY in security/rules/risk_override.py) and aligns with mypy strict mode expectations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/tools/sandbox/factory.py` around lines 39 - 44, Change the
typing and values of _DEFAULT_GVISOR_OVERRIDES to use NotBlankStr instead of
plain str so it matches DockerSandboxConfig.runtime_overrides; specifically,
import NotBlankStr and update the MappingProxyType annotation and the literal
keys/values ("code_execution", "terminal", "runsc") to be NotBlankStr instances
(matching the pattern used for _DEFAULT_REVOKED_BY) so mypy and type checks
align with the model's dict[NotBlankStr, NotBlankStr] expectation.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

♻️ Duplicate comments (1)
src/synthorg/tools/sandbox/docker_sandbox.py (1)

239-245: ⚠️ Potential issue | 🟠 Major

This still strips ordinary env vars like MAX_TOKENS.

sanitize_env() runs on every override here, but SandboxCredentialManager._is_credential_key() matches broad fragments such as token, secret, auth, and dsn. So non-secret settings like MAX_TOKENS, TOKENIZER_CACHE, or SECRET_SOURCE_DIR can disappear before container creation, and this path gives the caller no visibility into what was dropped. Please tighten the matcher or restrict stripping to an explicit credential allowlist before sanitizing at this layer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/tools/sandbox/docker_sandbox.py` around lines 239 - 245, The
sanitize_env call is currently stripping non-sensitive keys because
SandboxCredentialManager._is_credential_key matches broad substrings; update the
sanitization to only remove explicit credential keys instead of fuzzy substring
matches: add an explicit credential allowlist (e.g., CREDENTIAL_KEYS) inside
SandboxCredentialManager and change _is_credential_key to check exact key
equality or canonicalized exact prefixes from that allowlist, and/or add an
alternate method sanitize_env_allowlist that accepts the allowlist; then call
that stricter sanitizer where docker_sandbox.py invokes sanitize_env before
_validate_env so ordinary vars like MAX_TOKENS or TOKENIZER_CACHE are preserved
and callers retain visibility into what was dropped.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/persistence/sqlite/schema.sql`:
- Around line 578-581: Add a single-column index on ssrf_violations.hostname to
speed hostname-based allowlist/violation lookups: create an index (e.g.
idx_sv_hostname) with IF NOT EXISTS on the ssrf_violations(hostname) column;
optionally include port (hostname, port) if lookups commonly use both fields.
Update the schema by adding the CREATE INDEX IF NOT EXISTS idx_sv_hostname ON
ssrf_violations(hostname); (or ON ssrf_violations(hostname, port) as
appropriate) alongside the existing idx_sv_status_timestamp and idx_sv_timestamp
entries.

In `@src/synthorg/tools/sandbox/docker_sandbox.py`:
- Around line 126-127: The constructor leaves _runtime_resolver unset so
_resolve_runtime currently ignores self._config.runtime_overrides; either
initialize a default SandboxRuntimeResolver from self._config in __init__ or
update _resolve_runtime to fallback to self._config.runtime_overrides when
_runtime_resolver is None; specifically modify __init__ (where _runtime_resolver
is declared) to build a resolver (e.g.,
SandboxRuntimeResolver(self._config.runtime_overrides) or equivalent) or alter
_resolve_runtime to check if self._runtime_resolver is None and then apply
self._config.runtime_overrides directly before returning the resolved runtime,
and keep set_runtime_resolver() behavior unchanged.

In `@tests/unit/security/test_ssrf_violation.py`:
- Around line 92-101: Add a new unit test mirroring
test_allowed_without_resolved_by_rejected but using SsrfViolationStatus.DENIED
to verify the same validation rule; specifically, create a method named
test_denied_without_resolved_by_rejected that constructs SsrfViolation with
status=SsrfViolationStatus.DENIED (omitting resolved_by) and asserts
pytest.raises(ValidationError, match="resolved_by"), placing it alongside
test_allowed_without_resolved_by_rejected to ensure symmetric coverage for
SsrfViolation validation.
- Around line 80-90: Add a new unit test mirroring
test_pending_with_resolved_by_rejected that asserts a ValidationError when a
SsrfViolation is created with status SsrfViolationStatus.PENDING but a non-None
resolved_at; implement a method named test_pending_with_resolved_at_rejected
which calls SsrfViolation(id="v-bad", timestamp=_NOW,
url="http://example.com:80", hostname="example.com", port=80,
status=SsrfViolationStatus.PENDING, resolved_at=_NOW + timedelta(minutes=5))
inside a pytest.raises(ValidationError, match="resolved_at") context to validate
the rule.
- Around line 56-68: The test_denied_violation currently only asserts v.status;
add assertions to verify the resolution fields on the SsrfViolation instance by
asserting that v.resolved_by == "admin-1" and that v.resolved_at equals the
expected datetime (_NOW + timedelta(minutes=5)) so the test validates both
resolver and resolution timestamp.
- Around line 42-55: The test_allowed_violation test creates an SsrfViolation
with resolved_at set but doesn't assert it; update the test_allowed_violation
function to also assert that v.resolved_at equals the expected timestamp (the
_NOW + timedelta(minutes=5) value used when constructing the SsrfViolation) so
the test verifies both resolved_by and resolved_at on the SsrfViolation
instance.

---

Duplicate comments:
In `@src/synthorg/tools/sandbox/docker_sandbox.py`:
- Around line 239-245: The sanitize_env call is currently stripping
non-sensitive keys because SandboxCredentialManager._is_credential_key matches
broad substrings; update the sanitization to only remove explicit credential
keys instead of fuzzy substring matches: add an explicit credential allowlist
(e.g., CREDENTIAL_KEYS) inside SandboxCredentialManager and change
_is_credential_key to check exact key equality or canonicalized exact prefixes
from that allowlist, and/or add an alternate method sanitize_env_allowlist that
accepts the allowlist; then call that stricter sanitizer where docker_sandbox.py
invokes sanitize_env before _validate_env so ordinary vars like MAX_TOKENS or
TOKENIZER_CACHE are preserved and callers retain visibility into what was
dropped.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f9e8af96-f888-4096-9cba-13acbb832bb0

📥 Commits

Reviewing files that changed from the base of the PR and between d38ce3d and af3a140.

📒 Files selected for processing (5)
  • src/synthorg/persistence/sqlite/schema.sql
  • src/synthorg/tools/sandbox/docker_sandbox.py
  • tests/unit/security/test_ssrf_violation.py
  • tests/unit/tools/sandbox/test_credential_manager.py
  • tests/unit/tools/sandbox/test_sandbox_factory.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Build Web
  • GitHub Check: Build Backend
  • GitHub Check: Build Sandbox
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations -- Python 3.14 has PEP 649
Use except A, B: syntax without parentheses -- ruff enforces this on Python 3.14
Line length: 88 characters (ruff)

Files:

  • tests/unit/tools/sandbox/test_sandbox_factory.py
  • tests/unit/security/test_ssrf_violation.py
  • tests/unit/tools/sandbox/test_credential_manager.py
  • src/synthorg/tools/sandbox/docker_sandbox.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Async: asyncio_mode = 'auto' -- no manual @pytest.mark.asyncio needed
Timeout: 30 seconds per test (global in pyproject.toml -- do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed)
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases
Property-based testing with Hypothesis: use @given + @settings. Profiles in tests/conftest.py: ci (deterministic, max_examples=10 + derandomize=True), dev (1000 examples), fuzz (10,000 examples, no deadline), extreme (500,000 examples, no deadline). Controlled via HYPOTHESIS_PROFILE env var
Failing Hypothesis examples are persisted to ~/.synthorg/hypothesis-examples/ (write-only shared DB) via _WriteOnlyDatabase in tests/conftest.py
Flaky tests: NEVER skip, dismiss, or ignore -- always fix them fully. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic. For blocking tasks, use asyncio.Event().wait() instead of asyncio.sleep(large_number)

Files:

  • tests/unit/tools/sandbox/test_sandbox_factory.py
  • tests/unit/security/test_ssrf_violation.py
  • tests/unit/tools/sandbox/test_credential_manager.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/tools/sandbox/test_sandbox_factory.py
  • tests/unit/security/test_ssrf_violation.py
  • tests/unit/tools/sandbox/test_credential_manager.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Type hints: all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)
Immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence)
Config vs runtime state: frozen Pydantic models for config/identity; separate mutable-via-copy models for runtime state that evolves. Never mix static config fields with mutable runtime fields
Pydantic v2: use BaseModel, model_validator, computed_field, ConfigDict. Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields
Use @computed_field for derived values instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens)
Use NotBlankStr (from core.types) for all identifier/name fields -- including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants -- instead of manual whitespace validators
Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task
Functions: < 50 lines, files < 800 lines
Errors: handle explicitly, never silently swallow
Validate: at system boundaries (user input, external APIs, config files)
Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(name)
Variable name for logger: always logger (not _logger, not log)
Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api)
Structured kwargs in logging: alway...

Files:

  • src/synthorg/tools/sandbox/docker_sandbox.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/tools/sandbox/docker_sandbox.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Never use import logging / logging.getLogger() / print() in application code (exception: observability/setup.py, observability/sinks.py, observability/syslog_handler.py, and observability/http_handler.py)

Files:

  • src/synthorg/tools/sandbox/docker_sandbox.py
🧠 Learnings (16)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T21:32:02.880Z
Learning: Applies to .github/workflows/docker.yml : Docker workflow: builds backend + web + sandbox images, pushes to GHCR, signs with cosign. SLSA L3 provenance attestations via actions/attest-build-provenance. Scans: Trivy (CRITICAL = hard fail, HIGH = warn) + Grype (critical cutoff) + CIS Docker Benchmark v1.6.0 compliance (informational). CVE triage via .github/.trivyignore.yaml and .github/.grype.yaml. Images only pushed after scans pass. Triggers on push to main and version tags (v*).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-06T16:35:12.934Z
Learning: Agents in the Parallel tier (Haiku-class) include 14 reviewers: async-concurrency-reviewer, comment-analyzer, design-token-audit, docs-consistency, go-conventions-enforcer, go-security-reviewer, infra-reviewer, issue-resolution-verifier, persistence-reviewer, pr-test-analyzer, security-reviewer, silent-failure-hunter, test-quality-reviewer, and type-design-analyzer
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T10:18:35.230Z
Learning: After the PR exists, use /aurelio-review-pr to handle external reviewer feedback
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T21:49:53.264Z
Learning: Fix everything valid — never skip when review agents find valid issues (including pre-existing issues in surrounding code, suggestions, and findings adjacent to the PR's changes). No deferring, no 'out of scope' skipping.
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to docker/Dockerfile.sandbox : Docker sandbox: `synthorg-sandbox` — Python 3.14 + Node.js + git, non-root (UID 10001), agent code execution sandbox

Applied to files:

  • tests/unit/tools/sandbox/test_sandbox_factory.py
  • src/synthorg/tools/sandbox/docker_sandbox.py
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to tests/**/*.py : Tests must use `test-provider`, `test-small-001`, etc. instead of real vendor names

Applied to files:

  • tests/unit/security/test_ssrf_violation.py
  • tests/unit/tools/sandbox/test_credential_manager.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to {src,tests,web,cli,site}/**/*.{py,ts,tsx,go,astro} : Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001. Vendor names may only appear in: (1) Operations design page provider list (docs/design/operations.md), (2) .claude/ skill/agent files, (3) third-party import paths/module names.

Applied to files:

  • tests/unit/security/test_ssrf_violation.py
  • tests/unit/tools/sandbox/test_credential_manager.py
📚 Learning: 2026-04-07T10:18:35.230Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T10:18:35.230Z
Learning: Applies to src/**/*.py : Vendor-agnostic everywhere: NEVER use real vendor names in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small. Vendor names only in: Operations design page, .claude/ files, third-party imports, provider presets (src/synthorg/providers/presets.py)

Applied to files:

  • tests/unit/security/test_ssrf_violation.py
  • tests/unit/tools/sandbox/test_credential_manager.py
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to {src,tests}/**/*.py : Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small` as aliases

Applied to files:

  • tests/unit/security/test_ssrf_violation.py
  • tests/unit/tools/sandbox/test_credential_manager.py
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to {src/synthorg/**/*.py,tests/**/*.py,web/src/**/*.{ts,tsx}} : NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small`

Applied to files:

  • tests/unit/security/test_ssrf_violation.py
  • tests/unit/tools/sandbox/test_credential_manager.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to tests/**/*.py : Tests must use test-provider, test-small-001, etc. for vendor-agnostic test data.

Applied to files:

  • tests/unit/security/test_ssrf_violation.py
  • tests/unit/tools/sandbox/test_credential_manager.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to web/**/*.vue : Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in dashboard code — use generic names: `example-provider`, `example-large-001`, etc.

Applied to files:

  • tests/unit/security/test_ssrf_violation.py
  • tests/unit/tools/sandbox/test_credential_manager.py
📚 Learning: 2026-03-31T20:07:03.035Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T20:07:03.035Z
Learning: Applies to {docs/design/operations.md,src/synthorg/providers/presets.py,.claude/**/*.{md,yml,yaml}} : Vendor names may appear only in: (1) Operations design page, (2) `.claude/` skill/agent files, (3) third-party import paths, (4) provider presets (`src/synthorg/providers/presets.py`), (5) tests using `test-provider`, `test-small-001`, etc.

Applied to files:

  • tests/unit/security/test_ssrf_violation.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to docker/Dockerfile : Docker: 3-stage build (builder → setup → distroless runtime) for backend, Chainguard Python, non-root (UID 65532), CIS-hardened

Applied to files:

  • src/synthorg/tools/sandbox/docker_sandbox.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.

Applied to files:

  • src/synthorg/tools/sandbox/docker_sandbox.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO level

Applied to files:

  • src/synthorg/tools/sandbox/docker_sandbox.py
📚 Learning: 2026-04-07T10:18:35.230Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-07T10:18:35.230Z
Learning: Applies to src/**/*.py : All state transitions must log at INFO

Applied to files:

  • src/synthorg/tools/sandbox/docker_sandbox.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to src/**/*.py : All state transitions must log at INFO level

Applied to files:

  • src/synthorg/tools/sandbox/docker_sandbox.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions.

Applied to files:

  • src/synthorg/tools/sandbox/docker_sandbox.py
🔇 Additional comments (7)
src/synthorg/persistence/sqlite/schema.sql (2)

530-553: LGTM! Previous review concerns addressed.

The risk_overrides table properly implements:

  • Paired nullability CHECK constraint for revoked_at/revoked_by (lines 542-546)
  • Optimized partial index with corrected column order (created_at DESC, expires_at) for list_active() queries
  • Schema column order matches the repository code exactly

555-581: LGTM! Previous review concerns addressed.

The ssrf_violations table properly implements:

  • Port range validation (BETWEEN 1 AND 65535)
  • Status enum constraint with consistent resolution field requirements (lines 569-575)
  • Appropriate composite index for status+timestamp queries
tests/unit/security/test_ssrf_violation.py (5)

1-13: LGTM!

Imports and module setup are correct. The use of UTC timezone for datetime constants is a good practice.


15-21: LGTM!

Enum value tests are straightforward and correct.


69-78: LGTM!

Immutability test correctly validates frozen model behavior using pytest.raises(ValidationError).


103-121: LGTM!

Port boundary tests correctly validate that port 0 is rejected and port 65535 (maximum valid) is accepted.


123-148: LGTM!

Tests for optional field defaults and serialization roundtrip are well-structured. The use of vendor-agnostic "test-provider" (line 144) correctly follows project guidelines.

Comment on lines +126 to +127
self._credential_manager = SandboxCredentialManager()
self._runtime_resolver: SandboxRuntimeResolver | None = None
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

runtime_overrides are inert unless another setup step happens.

__init__() always leaves _runtime_resolver unset, and _resolve_runtime() then ignores self._config.runtime_overrides entirely. That makes the new per-category hardening ineffective for direct DockerSandbox(config=...) users and any construction path that forgets to call set_runtime_resolver() later.

🛠 Minimal fallback if you do not want to build the resolver in __init__
     def _resolve_runtime(self, category: str) -> str | None:
         if self._runtime_resolver is not None:
             return self._runtime_resolver.resolve_runtime(category)
+        if category:
+            override = self._config.runtime_overrides.get(category)
+            if override is not None:
+                return override
         return self._config.runtime

Also applies to: 313-321

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/tools/sandbox/docker_sandbox.py` around lines 126 - 127, The
constructor leaves _runtime_resolver unset so _resolve_runtime currently ignores
self._config.runtime_overrides; either initialize a default
SandboxRuntimeResolver from self._config in __init__ or update _resolve_runtime
to fallback to self._config.runtime_overrides when _runtime_resolver is None;
specifically modify __init__ (where _runtime_resolver is declared) to build a
resolver (e.g., SandboxRuntimeResolver(self._config.runtime_overrides) or
equivalent) or alter _resolve_runtime to check if self._runtime_resolver is None
and then apply self._config.runtime_overrides directly before returning the
resolved runtime, and keep set_runtime_resolver() behavior unchanged.

@Aureliolo Aureliolo temporarily deployed to cloudflare-preview April 7, 2026 11:19 — with GitHub Actions Inactive
@Aureliolo Aureliolo merged commit 31e7273 into main Apr 7, 2026
33 of 35 checks passed
@Aureliolo Aureliolo deleted the feat/security-hardening branch April 7, 2026 13:18
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview April 7, 2026 13:18 — with GitHub Actions Inactive
Aureliolo added a commit that referenced this pull request Apr 7, 2026
🤖 I have created a release *beep* *boop*
---


##
[0.6.4](v0.6.3...v0.6.4)
(2026-04-07)


### Features

* analytics and metrics runtime pipeline
([#226](#226),
[#225](#225),
[#227](#227),
[#224](#224))
([#1127](#1127))
([ec57641](ec57641))
* engine intelligence -- quality signals, health monitoring, trajectory
scoring, coordination metrics
([#1099](#1099))
([aac2029](aac2029)),
closes [#697](#697)
[#707](#707)
[#705](#705)
[#703](#703)
* enterprise-grade auth -- HttpOnly cookie sessions, CSRF, lockout,
session limits
([#1102](#1102))
([d3022c7](d3022c7)),
closes [#1068](#1068)
* implement core tool categories and granular sub-constraints
([#1101](#1101))
([0611b53](0611b53)),
closes [#1034](#1034)
[#220](#220)
* memory evolution -- GraphRAG/consistency research +
SelfEditingMemoryStrategy
([#1036](#1036),
[#208](#208))
([#1129](#1129))
([a9acda3](a9acda3))
* security hardening -- sandbox, risk override, SSRF self-heal, DAST fix
([#1100](#1100))
([31e7273](31e7273)),
closes [#1098](#1098)
[#696](#696)
[#222](#222)
[#671](#671)


### Bug Fixes

* harden agent identity versioning post-review
([#1128](#1128))
([8eb2859](8eb2859)),
closes [#1076](#1076)


### Documentation

* engine architecture research
([#688](#688)
[#690](#690)
[#848](#848)
[#687](#687))
([#1114](#1114))
([59b31f9](59b31f9))


### Maintenance

* add .claudeignore and split CLAUDE.md for token optimization
([#1112](#1112))
([b0fbd18](b0fbd18))
* bump github.com/sigstore/protobuf-specs from 0.5.0 to 0.5.1 in /cli in
the all group
([#1106](#1106))
([73089c9](73089c9))
* bump jsdom from 29.0.1 to 29.0.2 in /site in the all group
([#1107](#1107))
([8e99dce](8e99dce))
* bump jsdom from 29.0.1 to 29.0.2 in /web in the all group
([#1108](#1108))
([ce8c749](ce8c749))
* bump python from `fb83750` to `6869258` in /docker/backend in the all
group ([#1104](#1104))
([4911726](4911726))
* bump python from `fb83750` to `6869258` in /docker/web in the all
group ([#1103](#1103))
([87bdf09](87bdf09))
* bump the all group across 1 directory with 4 updates
([#1111](#1111))
([f702464](f702464))
* bump the all group in /docker/sandbox with 2 updates
([#1105](#1105))
([05a91ca](05a91ca))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants