feat: add pluggable output scan response policies by Aureliolo · Pull Request #280 · Aureliolo/synthorg

Aureliolo · 2026-03-10T21:53:19Z

Summary

Add pluggable output scan response policies (OutputScanResponsePolicy protocol) with 4 built-in strategies: RedactPolicy, WithholdPolicy, LogOnlyPolicy, AutonomyTieredPolicy
Add OutputScanPolicyType config enum and build_output_scan_policy() factory for config-driven policy selection
Integrate policy into SecOpsService.scan_output() with proper error handling (fail-safe to raw scan result)
Simplify Docker CI scan configuration (merge Trivy CRITICAL+HIGH into single step, raise Grype cutoff to HIGH, remove CVE ignore files)
Comprehensive test coverage: policy unit tests, factory tests, service integration tests, invoker gap tests (6691 tests pass, 94.83% coverage)

Output Scan Policies

Policy	Behavior	Default Autonomy Level
Redact (default)	Pass through scanner's redaction	SEMI, SUPERVISED
Withhold	Clear redacted content — fail-closed	LOCKED
Log-only	Log findings at WARNING, pass original output	FULL
Autonomy-tiered	Delegate to sub-policy based on autonomy level	Composite

Pre-PR Review Hardening (17 findings from 10 agents)

Error handling around policy.apply() with MemoryError/RecursionError re-raise
LogOnlyPolicy upgraded from DEBUG to WARNING when sensitive data detected
AutonomyTieredPolicy warns on unmapped autonomy level fallback
_DEFAULT_AUTONOMY_POLICY_MAP wrapped in MappingProxyType (immutability)
Factory function wires OutputScanPolicyType config enum to runtime policy instances
Docstring improvements (WithholdPolicy findings preservation, LogOnlyPolicy clarity, SecurityConfig attributes)
DESIGN_SPEC.md §12.3 and §15.3 updated with policy documentation

Test plan

All 4 policies tested for sensitive and clean inputs
Protocol compliance verified via isinstance checks
AutonomyTieredPolicy tested for all 4 autonomy levels + None fallback + custom map + missing key fallback
Factory tested for all enum members + autonomy integration
Service integration: policy applied after scan, policy transforms result, no-policy passthrough, policy failure returns raw result, MemoryError propagates
Config: OutputScanPolicyType defaults, all members accepted, invalid rejected, JSON roundtrip
Invoker gaps: non-recoverable errors propagate, tool errors skip scan, content passing, invoke_all scanning, soft error scanning
Audit failure isolation preserved
6691 passed, 9 skipped, 94.83% coverage

Closes #263

🤖 Generated with Claude Code

github-actions · 2026-03-10T21:53:28Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

coderabbitai · 2026-03-10T21:53:41Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1405b851-bfd4-4fb6-b7f1-a6e4e04cec42

📥 Commits

Reviewing files that changed from the base of the PR and between 4a8faff and 2e3a3aa.

📒 Files selected for processing (6)

README.md
src/ai_company/security/config.py
src/ai_company/security/output_scan_policy.py
tests/unit/security/test_config.py
tests/unit/security/test_output_scan_policy.py
tests/unit/security/test_service.py

📝 Walkthrough

Summary by CodeRabbit

New Features
- Configurable post-scan response system with four policies: redact (default), withhold, log-only, and autonomy‑tiered; service applies the selected policy after output scanning.
- Added an observability event for when a policy is applied and config-based policy selection.
Documentation
- Updated design docs and README to describe output-scan policy options and configuration.
Tests
- Added comprehensive unit tests for policies, factory selection, service integration, and tool-invoker scenarios.

Walkthrough

Adds a pluggable post-output handling layer: new OutputScanPolicyType config enum and SecurityConfig field, a Policy protocol with four concrete policies and default autonomy→policy map, a factory to build policies, SecOpsService wiring to apply policies after scanning/audit (with observability), and accompanying unit tests and docs updates.

Changes

Cohort / File(s)	Summary
Config & docs `src/ai_company/security/config.py`, `CLAUDE.md`, `DESIGN_SPEC.md`, `README.md`	Add `OutputScanPolicyType` enum and `SecurityConfig.output_scan_policy_type` (default AUTONOMY_TIERED per design); update docs to list "output scan response policies".
Policy implementations `src/ai_company/security/output_scan_policy.py`	Add `OutputScanResponsePolicy` Protocol and four implementations: `RedactPolicy`, `WithholdPolicy`, `LogOnlyPolicy`, `AutonomyTieredPolicy`; include `_DEFAULT_AUTONOMY_POLICY_MAP` and emit `SECURITY_OUTPUT_SCAN_POLICY_APPLIED` when applied.
Policy factory `src/ai_company/security/output_scan_policy_factory.py`	Add `build_output_scan_policy()` to construct policy instances from `OutputScanPolicyType`; handle AUTONOMY_TIERED with optional `effective_autonomy`, log on missing autonomy, and error on unknown types.
Service integration `src/ai_company/security/service.py`	SecOpsService gains optional `output_scan_policy` parameter (or builds one from config) and applies the policy in `scan_output()` after scanning and audit; logs and returns raw redacted result on policy apply failure.
Public API surface `src/ai_company/security/__init__.py`	Expose `OutputScanPolicyType`, `OutputScanResponsePolicy`, `RedactPolicy`, `WithholdPolicy`, `LogOnlyPolicy`, `AutonomyTieredPolicy`, and `build_output_scan_policy` on the security package surface.
Observability `src/ai_company/observability/events/security.py`	Add event constant `SECURITY_OUTPUT_SCAN_POLICY_APPLIED`.
Tests — policies & factory `tests/unit/security/test_output_scan_policy.py`, `tests/unit/security/test_output_scan_policy_factory.py`	Comprehensive tests for policy behavior, delegation, custom maps, factory instantiation, and protocol compliance.
Tests — config `tests/unit/security/test_config.py`	Tests for `OutputScanPolicyType` integration in `SecurityConfig`, defaults, JSON round-trip, and validation.
Tests — service & tools integration `tests/unit/security/test_service.py`, `tests/unit/tools/test_invoker_security.py`	Tests covering policy application in `scan_output()`, audit interaction and error handling, and ToolInvoker output-scanning flows (soft errors, multiple calls, denial/escalation).

Sequence Diagram(s)

sequenceDiagram
    participant Tool as Tool
    participant SecOps as SecOpsService
    participant Scanner as OutputScanner
    participant Audit as AuditLog
    participant Policy as OutputScanPolicy
    participant Context as SecurityContext

    Tool->>SecOps: return output content
    SecOps->>Scanner: scan(content)
    Scanner-->>SecOps: OutputScanResult
    SecOps->>Audit: record(scan_result)
    Audit-->>SecOps: recorded
    alt output_scan_policy configured
        SecOps->>Policy: apply(scan_result, Context)
        Policy-->>SecOps: transformed OutputScanResult
        SecOps->>Context: emit SECURITY_OUTPUT_SCAN_POLICY_APPLIED
    else no policy or policy fails
        Note over SecOps: return raw (redacted) scan_result
    end
    SecOps-->>Tool: sanitized result

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

feat: add pluggable output scan response policies #280: Implements the same OutputScanResponsePolicy types, factory, and SecOps wiring — directly related and likely overlapping changes.
feat: add SecOps security agent with rule engine, audit log, and ToolInvoker integration #192: Extends the SecOps output-scanning path and adds policy/factory wiring and observability — strongly related to these edits.
#263: Design issue calling for integrating OutputScanner into ToolInvoker execution flow — functionally relevant to the ToolInvoker tests and scan integration in this PR.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 57.53% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately and concisely summarizes the main change: adding pluggable output scan response policies. It directly reflects the primary feature introduced in the changeset.
Description check	✅ Passed	The description is comprehensive and directly related to the changeset. It outlines the policies, configuration, integration, test coverage, and links to the related issue, all of which are reflected in the code changes.
Linked Issues check	✅ Passed	The PR successfully implements the core objectives from issue `#263`: OutputScanner integration into ToolInvoker, configurable response policies per autonomy level, and post-tool scanning with redaction.
Out of Scope Changes check	✅ Passed	All code changes are within scope of issue `#263` and the PR objectives. Minor documentation updates and CI improvements are supportive of the main feature and not out-of-scope.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feat/output-scanner-integration

✨ Simplify code

Create PR with simplified code
Commit simplified code in branch feat/output-scanner-integration

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist · 2026-03-10T21:53:54Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the security framework by introducing a pluggable policy system for handling the results of output scans. It allows the system to respond to detected sensitive data in a configurable manner, such as redacting, withholding, or merely logging the findings, with behavior potentially varying based on the agent's autonomy level. This provides greater control and flexibility in managing sensitive information, improving the overall security posture without disrupting core operations.

Highlights

Pluggable Output Scan Response Policies: Introduced a new OutputScanResponsePolicy protocol with four built-in strategies: RedactPolicy, WithholdPolicy, LogOnlyPolicy, and AutonomyTieredPolicy.
Config-Driven Policy Selection: Added an OutputScanPolicyType configuration enum and a build_output_scan_policy() factory function for selecting policies based on configuration.
SecOpsService Integration: Integrated the new policy system into SecOpsService.scan_output(), ensuring proper error handling that defaults to the raw scan result in case of policy application failure.
Documentation and Hardening: Updated DESIGN_SPEC.md and CLAUDE.md with policy documentation, and addressed pre-PR review findings including error handling, logging levels, and immutability of default policy maps.
Comprehensive Test Coverage: Ensured extensive test coverage for policy units, factory functions, service integration, and invoker gap scenarios, with 6691 tests passing and 94.83% coverage.

Changelog

CLAUDE.md
- Updated the security section to include output scan response policies.
DESIGN_SPEC.md
- Updated the implementation snapshot to reflect the completion of output scan response policies.
- Added a new section detailing the functionality and behavior of output scan response policies.
src/ai_company/observability/events/security.py
- Added a new event constant, SECURITY_OUTPUT_SCAN_POLICY_APPLIED, for logging policy application.
src/ai_company/security/init.py
- Imported and exposed new security configuration and policy-related modules and classes.
src/ai_company/security/config.py
- Introduced the OutputScanPolicyType enum for declarative policy selection.
- Added output_scan_policy_type field to SecurityConfig to configure the active policy.
src/ai_company/security/output_scan_policy.py
- Added a new file defining the OutputScanResponsePolicy protocol.
- Implemented RedactPolicy, WithholdPolicy, LogOnlyPolicy, and AutonomyTieredPolicy as concrete policy strategies.
src/ai_company/security/output_scan_policy_factory.py
- Added a new file containing the build_output_scan_policy factory function for creating policy instances.
src/ai_company/security/service.py
- Modified the SecOpsService to accept and apply an OutputScanResponsePolicy after scanning tool output.
- Implemented robust error handling for policy application, ensuring a fail-safe return of raw scan results.
tests/unit/security/test_config.py
- Added tests for the OutputScanPolicyType enum, including acceptance in SecurityConfig and JSON roundtrip validation.
tests/unit/security/test_output_scan_policy.py
- Added a new file with comprehensive unit tests for RedactPolicy, WithholdPolicy, LogOnlyPolicy, and AutonomyTieredPolicy.
tests/unit/security/test_output_scan_policy_factory.py
- Added a new file with unit tests for the build_output_scan_policy factory function.
tests/unit/security/test_service.py
- Extended tests for SecOpsService to cover audit failure isolation and the integration of output scan policies.
- Added tests for policy application, result transformation, and error propagation scenarios.
tests/unit/tools/test_invoker_security.py
- Added new test cases for invoker gaps, including non-recoverable errors during scanning, skipping scans on tool execution errors, content passing to scan_output, invoke_all output scanning, and scanning of soft error content.

Activity

Pre-PR review hardening addressed 17 findings from 10 agents, improving error handling, logging, and immutability.
Error handling for MemoryError and RecursionError during policy application was implemented to re-raise these critical exceptions.
LogOnlyPolicy was updated to log findings at a WARNING level when sensitive data is detected.
AutonomyTieredPolicy now includes warnings for unmapped autonomy levels, falling back gracefully.
The default autonomy-to-policy map (_DEFAULT_AUTONOMY_POLICY_MAP) was wrapped in MappingProxyType for immutability.
Comprehensive test coverage was added, including unit tests for all four policies, factory functions, and service integration, ensuring correct behavior across various scenarios.
Invoker gap tests were added to verify behavior in edge cases such as non-recoverable errors, tool execution failures, and invoke_all scenarios.
All 6691 tests passed, with 9 skipped, achieving 94.83% code coverage.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

greptile-apps · 2026-03-10T21:56:39Z

Greptile Summary

This PR introduces a pluggable output scan response policy system for the SecOpsService, adding a OutputScanResponsePolicy protocol with four built-in strategies (RedactPolicy, WithholdPolicy, LogOnlyPolicy, AutonomyTieredPolicy), a config enum (OutputScanPolicyType), and a factory function (build_output_scan_policy). The policies slot cleanly into SecOpsService.scan_output() via constructor injection or config-driven defaults, and the design is well-aligned with the codebase's protocol-first, fail-closed philosophy.

Key highlights and findings:

Critical syntax error: The newly added except MemoryError, RecursionError: at service.py:280 uses Python 2 exception-binding syntax that was explicitly removed in Python 3 (PEP 3110). This is a SyntaxError that prevents service.py from importing. The same pre-existing pattern appears at lines 179, 263, 381, and 433 and should be corrected to except (MemoryError, RecursionError): throughout.
The output_scan_policy.py implementation is clean and well-reasoned: MappingProxyType immutability, shallow-copy decoupling for custom policy_map, import-time integrity check with a companion CI test, and a WARNING-level log for LogOnlyPolicy when sensitive data is detected.
Previous review findings (event constant semantics, deepcopy removal, AutonomyTieredPolicy fallback warning) have all been addressed.
Test coverage is thorough: protocol compliance, all four autonomy levels, custom map fallback, service integration, and error-propagation cases.

Confidence Score: 1/5

Not safe to merge — service.py contains a Python 3 SyntaxError that will prevent the module from loading.
The new except MemoryError, RecursionError: at line 280 of service.py (and identical pre-existing patterns at lines 179, 263, 381, 433) uses the Python 2 comma-binding syntax removed by PEP 3110. In Python 3 this is a SyntaxError, making service.py unimportable. Everything else — the policy protocol, implementations, factory, config, and tests — is well-structured and correct. The score would be 4–5 once the syntax is fixed to except (MemoryError, RecursionError):.
src/ai_company/security/service.py — all five except MemoryError, RecursionError: occurrences must be parenthesised.

Important Files Changed

Filename	Overview
src/ai_company/security/output_scan_policy.py	Introduces the `OutputScanResponsePolicy` protocol and four concrete implementations (RedactPolicy, WithholdPolicy, LogOnlyPolicy, AutonomyTieredPolicy), along with the `_DEFAULT_AUTONOMY_POLICY_MAP` with an import-time integrity check. Implementation is clean, well-documented, and the previous review comments (deepcopy removal, MappingProxyType immutability) have been addressed.
src/ai_company/security/service.py	Integrates the output scan policy into `scan_output()` and adds `output_scan_policy` constructor injection. Contains a critical Python 3 SyntaxError: `except MemoryError, RecursionError:` (Python 2 syntax, removed in PEP 3110) at line 280 — the new addition — plus pre-existing identical syntax errors at lines 179, 263, 381, and 433. The module will fail to import in Python 3.
src/ai_company/security/output_scan_policy_factory.py	Clean factory function mapping `OutputScanPolicyType` enum values to policy instances via `match`. Previous review concern (wrong event constant) was already addressed — `SECURITY_CONFIG_LOADED` is now correctly used for the factory-time warning.
src/ai_company/security/config.py	Adds `OutputScanPolicyType` StrEnum and `output_scan_policy_type` field to `SecurityConfig` with a sensible default of `AUTONOMY_TIERED`. Well-structured with clear docstrings.
src/ai_company/security/init.py	Correctly exports all new public symbols (`OutputScanPolicyType`, all four policy classes, `build_output_scan_policy`) and updates `__all__`.
tests/unit/security/test_output_scan_policy.py	Comprehensive policy unit tests covering all four strategies, protocol compliance via `isinstance`, all autonomy level combinations, custom map overrides, and the `_DEFAULT_AUTONOMY_POLICY_MAP` integrity assertion.
tests/unit/security/test_service.py	New `TestSecOpsScanOutputPolicy` class tests policy application, transformation, no-policy passthrough, policy failure fallback, and `MemoryError`/`RecursionError` propagation. Coverage looks solid.
src/ai_company/observability/events/security.py	Adds `SECURITY_OUTPUT_SCAN_POLICY_APPLIED` event constant. Clean addition consistent with the existing constant style.

Sequence Diagram

sequenceDiagram
    participant Invoker
    participant SecOpsService
    participant OutputScanner
    participant AuditLog
    participant OutputScanResponsePolicy

    Invoker->>SecOpsService: scan_output(context, output)
    SecOpsService->>OutputScanner: scan(output)
    OutputScanner-->>SecOpsService: OutputScanResult
    alt has_sensitive_data AND audit_enabled
        SecOpsService->>AuditLog: record(AuditEntry)
    end
    SecOpsService->>OutputScanResponsePolicy: apply(scan_result, context)
    note over OutputScanResponsePolicy: RedactPolicy → pass-through<br/>WithholdPolicy → clear redacted_content<br/>LogOnlyPolicy → return OutputScanResult()<br/>AutonomyTieredPolicy → delegate by level
    OutputScanResponsePolicy-->>SecOpsService: transformed OutputScanResult
    alt policy.apply() raises (non MemoryError/RecursionError)
        SecOpsService-->>Invoker: raw scan_result (fail-safe)
    else success
        SecOpsService-->>Invoker: transformed OutputScanResult
    end

_{Last reviewed commit: 2e3a3aa}

src/ai_company/security/output_scan_policy_factory.py

Copilot

Pull request overview

Introduces pluggable output scan response policies to control how sensitive-output findings are handled after OutputScanner runs, with supporting config surface and expanded unit/service tests. This extends the security subsystem’s post-tool output scanning so it can be tuned by strategy (redact/withhold/log-only/autonomy-tiered) rather than being hard-coded.

Changes:

Add OutputScanResponsePolicy protocol + built-in policy implementations and a factory (build_output_scan_policy) for enum-driven selection.
Integrate an optional output_scan_policy into SecOpsService.scan_output() (policy applied after audit recording; failures fall back to raw scan result).
Add/extend unit tests covering policy behavior, factory behavior, and invoker/service integration gaps; update docs/events/exports accordingly.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
`src/ai_company/security/output_scan_policy.py`	Adds policy protocol + concrete policy implementations (redact/withhold/log-only/autonomy-tiered).
`src/ai_company/security/output_scan_policy_factory.py`	Adds config-enum-to-policy factory for constructing policy instances.
`src/ai_company/security/service.py`	Applies an optional response policy after scanning/auditing in `scan_output()`.
`src/ai_company/security/config.py`	Adds `OutputScanPolicyType` enum + `SecurityConfig.output_scan_policy_type` default.
`src/ai_company/security/__init__.py`	Exposes new policy types/factory via package public API.
`src/ai_company/observability/events/security.py`	Adds a new event constant for policy-related logging.
`tests/unit/security/test_output_scan_policy.py`	New unit tests for each policy + protocol compliance checks.
`tests/unit/security/test_output_scan_policy_factory.py`	New tests validating factory outputs for each enum value.
`tests/unit/security/test_service.py`	Adds scan_output audit-failure isolation + policy integration tests.
`tests/unit/security/test_config.py`	Adds config default/roundtrip/validation coverage for `OutputScanPolicyType`.
`tests/unit/tools/test_invoker_security.py`	Adds invoker “gap tests” around output scanning behavior and error propagation.
`DESIGN_SPEC.md` / `CLAUDE.md`	Documents the new policy concept and updates implementation snapshot.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-10T22:01:05Z

src/ai_company/security/service.py

        approval_store: ApprovalStore | None = None,
        effective_autonomy: EffectiveAutonomy | None = None,
        risk_classifier: RiskTierClassifier | None = None,
+        output_scan_policy: OutputScanResponsePolicy | None = None,
    ) -> None:


SecurityConfig now exposes output_scan_policy_type, but the default SecOpsService wiring won’t use it unless callers explicitly pass output_scan_policy. This means changing config has no effect in the common path. Consider constructing a default policy in SecOpsService.__init__ when output_scan_policy is None (via build_output_scan_policy(config.output_scan_policy_type, effective_autonomy=effective_autonomy)), while keeping constructor injection as an override.

Copilot · 2026-03-10T22:01:05Z

src/ai_company/security/output_scan_policy_factory.py

+            Required when ``policy_type`` is ``AUTONOMY_TIERED``;
+            ignored otherwise.


The docstring says effective_autonomy is “Required when policy_type is AUTONOMY_TIERED”, but the implementation explicitly allows None (it logs a warning and falls back to RedactPolicy). Update the docstring to match the actual behavior, or enforce the requirement by raising a ValueError when policy_type is AUTONOMY_TIERED and effective_autonomy is None.

Suggested change

Required when ``policy_type`` is ``AUTONOMY_TIERED``;

ignored otherwise.

Used when ``policy_type`` is ``AUTONOMY_TIERED``. If ``None``

in that case, a warning is logged and the policy will fall

back to ``RedactPolicy``. Ignored for other policy types.

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/ai_company/security/__init__.py (1)

1-14: 🧹 Nitpick | 🔵 Trivial

Update the module docstring to mention the new policy types.

The public API docstring lists OutputScanResult / OutputScanner but doesn't mention the newly exported policy types (OutputScanPolicyType, OutputScanResponsePolicy, policy implementations, build_output_scan_policy).

📝 Suggested docstring update

 - ``OutputScanResult`` / ``OutputScanner`` — post-tool output scanning.
+- ``OutputScanResponsePolicy`` — protocol for output scan policies.
+- ``RedactPolicy`` / ``WithholdPolicy`` / ``LogOnlyPolicy`` / ``AutonomyTieredPolicy`` — policy implementations.
+- ``OutputScanPolicyType`` / ``build_output_scan_policy`` — config-driven policy selection.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/security/__init__.py` around lines 1 - 14, Update the module
docstring at the top of src/ai_company/security/__init__.py to include the new
output-scan policy exports by adding mentions of OutputScanPolicyType,
OutputScanResponsePolicy, the policy implementations, and
build_output_scan_policy alongside the existing OutputScanResult / OutputScanner
entry; edit the public API list so it explicitly documents these symbols
(OutputScanPolicyType, OutputScanResponsePolicy, build_output_scan_policy, plus
any concrete policy class names) so they are discoverable in the module
docstring.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/security/output_scan_policy.py`:
- Around line 203-216: The constructor currently stores the caller-supplied
policy_map by reference, allowing external mutation; in the __init__ where
policy_map is handled (the policy_map parameter and self._policy_map
assignment), make a deep copy of the provided mapping (use copy.deepcopy) and
then wrap that copy with MappingProxyType before assigning to self._policy_map;
when policy_map is None still assign a MappingProxyType-wrapped copy of
_DEFAULT_AUTONOMY_POLICY_MAP so the stored map is always frozen and cannot be
mutated by callers.

In `@tests/unit/security/test_output_scan_policy_factory.py`:
- Around line 30-73: Add a unit test verifying the factory's defensive TypeError
for unknown types: inside TestBuildOutputScanPolicy add a method (e.g.
test_unknown_policy_type_raises_type_error) that calls build_output_scan_policy
with an invalid value (e.g. a string, using type: ignore) and asserts it raises
TypeError with a message containing "Unknown output scan policy type"; reference
build_output_scan_policy and the defensive branch in
output_scan_policy_factory.py to locate the behavior to test.

In `@tests/unit/tools/test_invoker_security.py`:
- Around line 624-656: Combine the two nearly identical tests
test_memory_error_from_scan_propagates and
test_recursion_error_from_scan_propagates into one parametrized test (e.g.,
test_nonrecoverable_scan_exceptions) using `@pytest.mark.parametrize` to iterate
over the exception instances or types (MemoryError("oom"), RecursionError("max
depth")) and their corresponding exception classes; in the single test set
interceptor = _make_interceptor(...), set interceptor.scan_output =
AsyncMock(side_effect=exception), create the ToolInvoker and assert with
pytest.raises(expected_exception) that await invoker.invoke(tool_call)
propagates the error. Ensure you reference and reuse the same setup
(security_registry, tool_call, _make_interceptor, ToolInvoker,
interceptor.scan_output) so adding more non-recoverable exceptions is simple.

---

Outside diff comments:
In `@src/ai_company/security/__init__.py`:
- Around line 1-14: Update the module docstring at the top of
src/ai_company/security/__init__.py to include the new output-scan policy
exports by adding mentions of OutputScanPolicyType, OutputScanResponsePolicy,
the policy implementations, and build_output_scan_policy alongside the existing
OutputScanResult / OutputScanner entry; edit the public API list so it
explicitly documents these symbols (OutputScanPolicyType,
OutputScanResponsePolicy, build_output_scan_policy, plus any concrete policy
class names) so they are discoverable in the module docstring.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f6a44960-3110-454c-a6b5-7bfa251531a0

📥 Commits

Reviewing files that changed from the base of the PR and between a488758 and 548439c.

📒 Files selected for processing (13)

CLAUDE.md
DESIGN_SPEC.md
src/ai_company/observability/events/security.py
src/ai_company/security/__init__.py
src/ai_company/security/config.py
src/ai_company/security/output_scan_policy.py
src/ai_company/security/output_scan_policy_factory.py
src/ai_company/security/service.py
tests/unit/security/test_config.py
tests/unit/security/test_output_scan_policy.py
tests/unit/security/test_output_scan_policy_factory.py
tests/unit/security/test_service.py
tests/unit/tools/test_invoker_security.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Agent
GitHub Check: Greptile Review

🧰 Additional context used

📓 Path-based instructions (4)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do NOT use from __future__ import annotations — Python 3.14 has PEP 649 with native lazy annotations
Use except A, B: syntax (no parentheses) for exception handling — PEP 758 style, enforced by ruff on Python 3.14
Add type hints to all public functions; enforce with mypy strict mode
Use Google-style docstrings, required on public classes and functions (enforced by ruff D rules)
Create new objects instead of mutating existing ones; for non-Pydantic internal collections use copy.deepcopy() at construction and wrap with MappingProxyType for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with model_copy(update=...) for runtime state that evolves
Use Pydantic v2 BaseModel, model_validator, computed_field, and ConfigDict; use @computed_field for derived values instead of storing redundant fields
Use NotBlankStr from core.types for all identifier/name fields (including optional NotBlankStr | None and tuple[NotBlankStr, ...] variants) instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls) over bare create_task
Maintain line length of 88 characters (enforced by ruff)
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly; never silently swallow errors
Validate at system boundaries: user input, external APIs, and config files
Every module with business logic MUST have: from ai_company.observability import get_logger then logger = get_logger(__name__)
Always use variable name logger (not _logger, not log) for logger instances
Always use event name constants from domain-specific mo...

Files:

src/ai_company/security/__init__.py
tests/unit/security/test_output_scan_policy.py
tests/unit/security/test_config.py
src/ai_company/security/output_scan_policy.py
src/ai_company/security/output_scan_policy_factory.py
tests/unit/security/test_service.py
src/ai_company/security/service.py
src/ai_company/observability/events/security.py
tests/unit/security/test_output_scan_policy_factory.py
src/ai_company/security/config.py
tests/unit/tools/test_invoker_security.py

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Never use import logging, logging.getLogger(), or print() in application code

Files:

src/ai_company/security/__init__.py
src/ai_company/security/output_scan_policy.py
src/ai_company/security/output_scan_policy_factory.py
src/ai_company/security/service.py
src/ai_company/observability/events/security.py
src/ai_company/security/config.py

{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples; use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases. Vendor names may only appear in: (1) DESIGN_SPEC.md provider list, (2) .claude/ skill/agent files, (3) third-party import paths/module names. Tests must use test-provider, test-small-001, etc.

Files:

src/ai_company/security/__init__.py
tests/unit/security/test_output_scan_policy.py
tests/unit/security/test_config.py
src/ai_company/security/output_scan_policy.py
src/ai_company/security/output_scan_policy_factory.py
tests/unit/security/test_service.py
src/ai_company/security/service.py
src/ai_company/observability/events/security.py
tests/unit/security/test_output_scan_policy_factory.py
src/ai_company/security/config.py
tests/unit/tools/test_invoker_security.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow for test categorization
Use asyncio_mode = 'auto' for async tests — no manual @pytest.mark.asyncio needed
Set test timeout to 30 seconds per test
Use pytest-xdist via -n auto for parallel test execution
Prefer @pytest.mark.parametrize for testing similar cases

Files:

tests/unit/security/test_output_scan_policy.py
tests/unit/security/test_config.py
tests/unit/security/test_service.py
tests/unit/security/test_output_scan_policy_factory.py
tests/unit/tools/test_invoker_security.py

🧬 Code graph analysis (9)

tests/unit/security/test_output_scan_policy.py (4)

src/ai_company/core/enums.py (2)

AutonomyLevel (459-469)

ToolCategory (294-308)

src/ai_company/security/autonomy/models.py (1)

EffectiveAutonomy (154-191)

src/ai_company/security/models.py (2)

OutputScanResult (152-177)

SecurityContext (70-100)

src/ai_company/security/output_scan_policy.py (10)

AutonomyTieredPolicy (193-267)

LogOnlyPolicy (129-177)

OutputScanResponsePolicy (29-55)

RedactPolicy (58-88)

WithholdPolicy (91-126)

apply (41-55)

apply (69-88)

apply (105-126)

apply (144-177)

apply (224-267)

tests/unit/security/test_config.py (1)

src/ai_company/security/config.py (2)

OutputScanPolicyType (17-34)

SecurityConfig (91-138)

src/ai_company/security/output_scan_policy.py (4)

src/ai_company/core/enums.py (1)

AutonomyLevel (459-469)

src/ai_company/observability/_logger.py (1)

get_logger (8-28)

src/ai_company/security/models.py (1)

OutputScanResult (152-177)

src/ai_company/security/autonomy/models.py (1)

EffectiveAutonomy (154-191)

src/ai_company/security/output_scan_policy_factory.py (3)

src/ai_company/security/config.py (1)

OutputScanPolicyType (17-34)

src/ai_company/security/output_scan_policy.py (4)

LogOnlyPolicy (129-177)

OutputScanResponsePolicy (29-55)

RedactPolicy (58-88)

WithholdPolicy (91-126)

src/ai_company/security/autonomy/models.py (1)

EffectiveAutonomy (154-191)

tests/unit/security/test_service.py (7)

src/ai_company/security/output_scan_policy.py (12)

WithholdPolicy (91-126)

OutputScanResponsePolicy (29-55)

apply (41-55)

apply (69-88)

apply (105-126)

apply (144-177)

apply (224-267)

name (37-39)

name (65-67)

name (101-103)

name (140-142)

name (220-222)

src/ai_company/security/output_scanner.py (2)

OutputScanner (31-70)

scan (34-70)

src/ai_company/security/rules/engine.py (2)

RuleEngine (32-174)

evaluate (74-151)

src/ai_company/security/service.py (2)

SecOpsService (70-435)

scan_output (210-270)

src/ai_company/security/models.py (1)

OutputScanResult (152-177)

src/ai_company/security/config.py (1)

SecurityConfig (91-138)

src/ai_company/security/audit.py (2)

AuditLog (19-136)

record (49-69)

src/ai_company/security/service.py (1)

src/ai_company/security/output_scan_policy.py (11)

OutputScanResponsePolicy (29-55)

apply (41-55)

apply (69-88)

apply (105-126)

apply (144-177)

apply (224-267)

name (37-39)

name (65-67)

name (101-103)

name (140-142)

name (220-222)

tests/unit/security/test_output_scan_policy_factory.py (2)

src/ai_company/security/output_scan_policy.py (9)

AutonomyTieredPolicy (193-267)

LogOnlyPolicy (129-177)

RedactPolicy (58-88)

WithholdPolicy (91-126)

name (37-39)

name (65-67)

name (101-103)

name (140-142)

name (220-222)

tests/unit/security/test_output_scan_policy.py (1)

_make_autonomy (138-144)

src/ai_company/security/config.py (2)

src/ai_company/core/enums.py (2)

ActionType (372-408)

ApprovalRiskLevel (443-449)

src/ai_company/security/models.py (1)

SecurityVerdictType (23-32)

tests/unit/tools/test_invoker_security.py (5)

src/ai_company/tools/registry.py (1)

ToolRegistry (30-122)

src/ai_company/security/models.py (2)

SecurityVerdictType (23-32)

OutputScanResult (152-177)

src/ai_company/security/service.py (1)

scan_output (210-270)

src/ai_company/security/protocol.py (1)

scan_output (40-54)

src/ai_company/tools/invoker.py (2)

invoke (313-371)

registry (119-121)

🔇 Additional comments (18)

CLAUDE.md (1)

80-80: Documentation matches the new security surface.

This addition accurately captures the new output scan policy layer and its built-in modes in the package overview.

src/ai_company/observability/events/security.py (1)

33-33: LGTM!

The new event constant follows the established naming convention and type annotation pattern used throughout the file.

src/ai_company/security/output_scan_policy_factory.py (2)

1-21: LGTM!

Module setup follows conventions: proper imports, TYPE_CHECKING guard for EffectiveAutonomy, and correctly named logger.

24-70: LGTM!

The factory function is well-structured:

Exhaustive match on all OutputScanPolicyType members

Appropriate warning when AUTONOMY_TIERED is used without effective_autonomy

Defensive fallback with TypeError for unknown types (unreachable under normal conditions)

Google-style docstring with complete Args/Returns/Raises documentation

src/ai_company/security/__init__.py (1)

34-43: LGTM!

The new policy-related imports and __all__ entries are properly organized and maintain alphabetical ordering.

Also applies to: 50-74

src/ai_company/security/config.py (2)

17-34: LGTM!

The OutputScanPolicyType enum is well-designed:

Follows the established StrEnum pattern used elsewhere in the codebase

Clear docstring with Members section documenting each variant

Sensible default implied by docstring ("REDACT: Return redacted content — current behavior")

91-124: LGTM!

The output_scan_policy_type field is properly integrated into SecurityConfig:

Type annotation matches the enum

Default value REDACT preserves backward compatibility

Docstring updated to document the new field

tests/unit/security/test_config.py (2)

287-310: LGTM!

Comprehensive test coverage for OutputScanPolicyType:

Parametrized test covering all enum values in config integration

Validation error test for invalid types

Direct enum value assertions

173-173: LGTM!

Default value assertion correctly verifies REDACT is the default policy type.

tests/unit/security/test_output_scan_policy.py (5)

48-69: LGTM!

RedactPolicy tests correctly verify pass-through behavior for both sensitive and clean results.

74-102: LGTM!

WithholdPolicy tests verify the fail-closed behavior: redacted_content is cleared while findings are preserved for audit purposes.

107-128: LGTM!

LogOnlyPolicy tests confirm the audit-only behavior: returns empty OutputScanResult regardless of input findings.

133-231: LGTM!

AutonomyTieredPolicy tests provide excellent coverage:

Default policy map delegation for each autonomy level

Fallback behavior when effective_autonomy is None

Custom policy map override functionality

Fallback to RedactPolicy when a level is missing from a custom map

237-254: LGTM!

Protocol compliance tests verify all concrete policy classes satisfy the OutputScanResponsePolicy protocol via isinstance checks.

src/ai_company/security/service.py (3)

45-47: LGTM!

Import of OutputScanResponsePolicy with # noqa: TC001 follows the established pattern for type-only imports in this file.

81-118: LGTM!

Constructor extension is clean:

New output_scan_policy parameter with proper type annotation and default None

Docstring clearly explains the parameter's purpose and behavior when None

Stored in _output_scan_policy following the class's naming convention

256-269: LGTM!

Policy application in scan_output is well-implemented:

Correctly placed after audit recording (findings preserved in audit before policy transformation)

Fail-safe error handling: logs the exception with policy name for debugging and returns raw scan result

Consistent with exception handling patterns elsewhere in the class (MemoryError, RecursionError re-raised)

src/ai_company/security/output_scan_policy.py (1)

19-23: No action required — TYPE_CHECKING pattern is correct for Python 3.14.

The code follows the coding guideline: "Do NOT use from __future__ import annotations — Python 3.14 has PEP 649 with native lazy annotations." Under PEP 649, annotations are evaluated lazily only when accessed via introspection (e.g., typing.get_type_hints()). Since this module and its tests perform no annotation introspection, the TYPE_CHECKING pattern for Mapping, EffectiveAutonomy, and SecurityContext is safe and correct. No changes needed.

src/ai_company/security/output_scan_policy.py

tests/unit/security/test_output_scan_policy_factory.py

tests/unit/tools/test_invoker_security.py

gemini-code-assist

Code Review

This pull request introduces a well-designed and robust system for pluggable output scan response policies. The addition of four distinct policies, a configuration-driven factory, and seamless integration into the SecOpsService significantly enhances the security capabilities of the framework. The error handling is commendable, particularly the fail-safe mechanism for policy application failures. The test coverage is comprehensive, covering unit, integration, and various edge cases, which ensures high confidence in the changes. I have one minor suggestion to improve logging consistency.

_{Note: Security Review did not run due to the size of the PR.}

gemini-code-assist · 2026-03-10T22:07:22Z

src/ai_company/security/output_scan_policy.py

+                logger.warning(
+                    SECURITY_OUTPUT_SCAN_POLICY_APPLIED,
+                    policy="autonomy_tiered",
+                    autonomy_level=autonomy_level.value,
+                    note=(
+                        f"No policy mapped for autonomy level "
+                        f"'{autonomy_level.value}' — falling back to "
+                        f"'{self._fallback.name}'"
+                    ),
+                )


For consistency with the structured logging approach used elsewhere in the codebase (e.g., in LogOnlyPolicy), it's better to avoid f-strings in log messages. Passing dynamic values as separate keyword arguments makes logs more easily machine-readable and queryable.

logger.warning( SECURITY_OUTPUT_SCAN_POLICY_APPLIED, policy="autonomy_tiered", autonomy_level=autonomy_level.value, fallback_to=self._fallback.name, note="No policy mapped for autonomy level — falling back", )

Add OutputScanResponsePolicy protocol with 4 concrete strategies (RedactPolicy, WithholdPolicy, LogOnlyPolicy, AutonomyTieredPolicy) for configurable per-autonomy output scan behavior. Wire policy into SecOpsService.scan_output and add OutputScanPolicyType enum to SecurityConfig. Fill 6 test gaps across invoker and service (31 new tests total).

…nd docs Pre-reviewed by 10 agents, 17 findings addressed: - Add build_output_scan_policy() factory to wire config enum to runtime - Wrap policy.apply() in try/except (fail-safe to raw scan result) - Upgrade LogOnlyPolicy logging to WARNING when sensitive data found - Add WARNING on AutonomyTieredPolicy fallback for unmapped levels - Wrap _DEFAULT_AUTONOMY_POLICY_MAP in MappingProxyType (immutability) - Add OutputScanPolicyType config tests, SUPERVISED level test, factory tests, policy-on-clean-result test, custom-map-fallback test - Update DESIGN_SPEC.md §12.3 with policy docs and §15.3 structure - Update CLAUDE.md package structure description - Improve docstrings (WithholdPolicy findings, LogOnlyPolicy clarity, SecurityConfig attributes)

…opilot, and Greptile - Add deepcopy + MappingProxyType for AutonomyTieredPolicy policy_map immutability - Auto-wire output_scan_policy from SecurityConfig in SecOpsService - Add import-time exhaustiveness check for AutonomyLevel ↔ policy map - Use safe getattr for policy name access in scan_output error handling - Update docstrings across service, config, factory, and __init__ modules - Parametrize non-recoverable exception tests (MemoryError/RecursionError) - Add audit-before-policy-clears test for LogOnlyPolicy integration - Add unknown policy type factory test - Add immutability assertion for WithholdPolicy - Update README security feature description

Mypy resolves _audit_log without the ignore — CI was failing on unused-ignore.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/security/config.py`:
- Line 123: The default for SecurityConfig.output_scan_policy_type is set to
OutputScanPolicyType.REDACT which causes FULL and LOCKED to behave like REDACT
and hides the new autonomy-aware mapping; change the default to
OutputScanPolicyType.AUTONOMY_TIERED (or remove the hard default so the factory
selects the appropriate policy based on autonomy level) so that the
autonomy-tiered behavior is the default; update any related initialization in
the factory that reads SecurityConfig.output_scan_policy_type to expect
AUTONOMY_TIERED and ensure FULL and LOCKED retain their distinct behaviors when
that value is set.

In `@tests/unit/security/test_service.py`:
- Around line 847-849: Remove the unnecessary type-ignore on the access to the
SecOpsService audit log: delete the " # type: ignore[attr-defined]" from the
line that assigns audit_log = service._audit_log in
tests/unit/security/test_service.py (the access to service._audit_log is valid
because _audit_log is a real instance attribute of SecOpsService), leaving the
direct assignment and subsequent assertions unchanged.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0db01553-c21f-4e1e-9c39-61ee4fda15d8

📥 Commits

Reviewing files that changed from the base of the PR and between 548439c and 43ce881.

📒 Files selected for processing (14)

CLAUDE.md
DESIGN_SPEC.md
README.md
src/ai_company/observability/events/security.py
src/ai_company/security/__init__.py
src/ai_company/security/config.py
src/ai_company/security/output_scan_policy.py
src/ai_company/security/output_scan_policy_factory.py
src/ai_company/security/service.py
tests/unit/security/test_config.py
tests/unit/security/test_output_scan_policy.py
tests/unit/security/test_output_scan_policy_factory.py
tests/unit/security/test_service.py
tests/unit/tools/test_invoker_security.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Greptile Review

🧰 Additional context used

📓 Path-based instructions (5)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use Python 3.14+ (PEP 649 native lazy annotations) — do NOT use from __future__ import annotations
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14
All public functions and classes must have type hints — mypy strict mode is enforced
All public classes and functions must have Google-style docstrings — enforced by ruff D rules
Create new objects instead of mutating existing ones (immutability principle); for non-Pydantic internal collections, use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Separate frozen Pydantic models for config/identity from mutable-via-copy models for runtime state that evolves; never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict); use @computed_field for derived values instead of storing redundant fields; use NotBlankStr for all identifier/name fields instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls) over bare create_task for structured concurrency
Line length: 88 characters maximum — enforced by ruff
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)
All provider calls must go through BaseCompletionProvider which applies retry and rate limiting automatically — never implement retry logic in driver subclasses or calling code
RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig; retryable errors (RateLimitError, ...

Files:

tests/unit/security/test_config.py
tests/unit/security/test_output_scan_policy.py
src/ai_company/security/output_scan_policy_factory.py
tests/unit/security/test_output_scan_policy_factory.py
tests/unit/security/test_service.py
src/ai_company/observability/events/security.py
src/ai_company/security/config.py
src/ai_company/security/service.py
src/ai_company/security/output_scan_policy.py
tests/unit/tools/test_invoker_security.py
src/ai_company/security/__init__.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow to categorize tests
Maintain 80% minimum code coverage — enforced in CI
Use asyncio_mode = "auto" in pytest configuration — no manual @pytest.mark.asyncio needed for async tests
Set test timeout to 30 seconds per test; use pytest-xdist via -n auto for parallel execution; prefer @pytest.mark.parametrize for testing similar cases

Files:

tests/unit/security/test_config.py
tests/unit/security/test_output_scan_policy.py
tests/unit/security/test_output_scan_policy_factory.py
tests/unit/security/test_service.py
tests/unit/tools/test_invoker_security.py

**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor names (Anthropic, OpenAI, Claude, GPT, etc.) may only appear in DESIGN_SPEC.md when listing supported providers (not in other markdown files)

Files:

README.md
CLAUDE.md
DESIGN_SPEC.md

DESIGN_SPEC.md

📄 CodeRabbit inference engine (CLAUDE.md)

ALWAYS read DESIGN_SPEC.md before implementing any feature or planning any issue — the design spec is the starting point for architecture, data models, and behavior

Files:

DESIGN_SPEC.md

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Every module with business logic MUST import logging from ai_company.observability with logger = get_logger(__name__) — never use import logging, logging.getLogger(), or print() in application code
Always use constants from domain-specific modules under ai_company.observability.events for event names (e.g., PROVIDER_CALL_START from events.provider), importing directly: from ai_company.observability.events.<domain> import EVENT_CONSTANT
Use structured logging kwargs: always logger.info(EVENT, key=value) — never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising; all state transitions must log at INFO; DEBUG for object creation, internal flow, entry/exit of key functions
Pure data models, enums, and re-exports do NOT need logging
Use absolute paths or run commands directly in Bash — never use cd in Bash commands; working directory is already set to project root

Files:

src/ai_company/security/output_scan_policy_factory.py
src/ai_company/observability/events/security.py
src/ai_company/security/config.py
src/ai_company/security/service.py
src/ai_company/security/output_scan_policy.py
src/ai_company/security/__init__.py

🧠 Learnings (5)

📚 Learning: 2026-03-10T22:19:11.971Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-10T22:19:11.971Z
Learning: Applies to .claude/** : Vendor names (Anthropic, OpenAI, Claude, GPT, etc.) may appear in `.claude/` skill and agent files

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-10T22:19:11.970Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-10T22:19:11.970Z
Learning: Applies to DESIGN_SPEC.md : ALWAYS read `DESIGN_SPEC.md` before implementing any feature or planning any issue — the design spec is the starting point for architecture, data models, and behavior

Applied to files:

DESIGN_SPEC.md

📚 Learning: 2026-03-10T22:19:11.971Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-10T22:19:11.971Z
Learning: Applies to **/*.py : Create new objects instead of mutating existing ones (immutability principle); for non-Pydantic internal collections, use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement

Applied to files:

src/ai_company/security/output_scan_policy.py

📚 Learning: 2026-03-10T22:19:11.971Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-10T22:19:11.971Z
Learning: Applies to **/*.py : For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)

Applied to files:

src/ai_company/security/output_scan_policy.py

📚 Learning: 2026-03-10T22:19:11.971Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-10T22:19:11.971Z
Learning: Applies to src/**/*.py : Use structured logging kwargs: always `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`

Applied to files:

src/ai_company/security/output_scan_policy.py

🧬 Code graph analysis (9)

tests/unit/security/test_config.py (1)

src/ai_company/security/config.py (2)

OutputScanPolicyType (17-34)

SecurityConfig (91-138)

src/ai_company/security/output_scan_policy_factory.py (4)

src/ai_company/observability/_logger.py (1)

get_logger (8-28)

src/ai_company/security/config.py (1)

OutputScanPolicyType (17-34)

src/ai_company/security/output_scan_policy.py (4)

AutonomyTieredPolicy (206-282)

LogOnlyPolicy (130-178)

OutputScanResponsePolicy (30-56)

RedactPolicy (59-89)

src/ai_company/security/autonomy/models.py (1)

EffectiveAutonomy (154-191)

tests/unit/security/test_output_scan_policy_factory.py (5)

src/ai_company/core/enums.py (1)

AutonomyLevel (459-469)

src/ai_company/security/autonomy/models.py (1)

EffectiveAutonomy (154-191)

src/ai_company/security/config.py (1)

OutputScanPolicyType (17-34)

src/ai_company/security/output_scan_policy.py (9)

AutonomyTieredPolicy (206-282)

LogOnlyPolicy (130-178)

RedactPolicy (59-89)

WithholdPolicy (92-127)

name (38-40)

name (66-68)

name (102-104)

name (141-143)

name (235-237)

src/ai_company/security/output_scan_policy_factory.py (1)

build_output_scan_policy (25-73)

tests/unit/security/test_service.py (9)

src/ai_company/security/output_scan_policy.py (13)

WithholdPolicy (92-127)

OutputScanResponsePolicy (30-56)

apply (42-56)

apply (70-89)

apply (106-127)

apply (145-178)

apply (239-282)

name (38-40)

name (66-68)

name (102-104)

name (141-143)

name (235-237)

LogOnlyPolicy (130-178)

src/ai_company/security/output_scanner.py (2)

OutputScanner (31-70)

scan (34-70)

src/ai_company/security/rules/engine.py (2)

RuleEngine (32-174)

evaluate (74-151)

src/ai_company/security/service.py (2)

SecOpsService (74-456)

scan_output (223-291)

src/ai_company/security/models.py (1)

OutputScanResult (152-177)

src/ai_company/security/rules/data_leak_detector.py (2)

evaluate (83-108)

name (79-81)

src/ai_company/security/rules/policy_validator.py (2)

evaluate (57-98)

name (53-55)

src/ai_company/security/audit.py (4)

AuditLog (19-136)

record (49-69)

count (129-131)

entries (134-136)

tests/unit/security/test_output_scan_policy.py (1)

_make_context (22-30)

src/ai_company/security/config.py (2)

src/ai_company/core/enums.py (2)

ActionType (372-408)

ApprovalRiskLevel (443-449)

src/ai_company/security/models.py (1)

SecurityVerdictType (23-32)

src/ai_company/security/service.py (3)

src/ai_company/security/output_scan_policy.py (6)

OutputScanResponsePolicy (30-56)

apply (42-56)

apply (70-89)

apply (106-127)

apply (145-178)

apply (239-282)

src/ai_company/security/output_scan_policy_factory.py (1)

build_output_scan_policy (25-73)

src/ai_company/memory/errors.py (1)

MemoryError (13-14)

src/ai_company/security/output_scan_policy.py (3)

src/ai_company/core/enums.py (1)

AutonomyLevel (459-469)

src/ai_company/security/models.py (2)

OutputScanResult (152-177)

SecurityContext (70-100)

src/ai_company/security/autonomy/models.py (1)

EffectiveAutonomy (154-191)

tests/unit/tools/test_invoker_security.py (4)

src/ai_company/security/models.py (2)

SecurityVerdictType (23-32)

OutputScanResult (152-177)

src/ai_company/security/service.py (1)

scan_output (223-291)

src/ai_company/security/protocol.py (1)

scan_output (40-54)

src/ai_company/tools/invoker.py (2)

invoke (313-371)

registry (119-121)

src/ai_company/security/__init__.py (3)

src/ai_company/security/config.py (1)

OutputScanPolicyType (17-34)

src/ai_company/security/output_scan_policy.py (5)

AutonomyTieredPolicy (206-282)

LogOnlyPolicy (130-178)

OutputScanResponsePolicy (30-56)

RedactPolicy (59-89)

WithholdPolicy (92-127)

src/ai_company/security/output_scan_policy_factory.py (1)

build_output_scan_policy (25-73)

🪛 GitHub Actions: CI

tests/unit/security/test_service.py

[error] 847-847: mypy reported Unused 'type: ignore' comment [unused-ignore]. Command: 'uv run mypy src/ tests/'.

🪛 LanguageTool

README.md

[typographical] ~40-~40: To join two clauses or introduce examples, consider using an em dash.
Context: ...only - Security/Approval System (M7) - SecOps agent with rule engine (soft-allo...

(DASH_RULE)

🔇 Additional comments (10)

src/ai_company/security/output_scan_policy.py (1)

1-282: LGTM!

The implementation is well-structured with clear separation of concerns:

The OutputScanResponsePolicy protocol provides a clean contract

All four concrete policies (RedactPolicy, WithholdPolicy, LogOnlyPolicy, AutonomyTieredPolicy) implement the protocol correctly

The import-time validation (lines 194-203) ensures _DEFAULT_AUTONOMY_POLICY_MAP stays in sync with AutonomyLevel

The AutonomyTieredPolicy.__init__ now correctly freezes the policy map using copy.deepcopy() and MappingProxyType (lines 228-230), addressing the immutability requirement

src/ai_company/security/output_scan_policy_factory.py (1)

25-73: LGTM!

The factory function is well-implemented:

Exhaustive handling of all OutputScanPolicyType members via match statement

Appropriate warning when AUTONOMY_TIERED is used without effective_autonomy

Defensive error handling for unknown types with proper logging before raising TypeError

src/ai_company/security/__init__.py (1)

11-78: LGTM!

The public API surface is correctly expanded to expose:

OutputScanPolicyType enum for config-driven policy selection

OutputScanResponsePolicy protocol and all concrete implementations

build_output_scan_policy factory function

The __all__ list maintains alphabetical ordering and the module docstring is updated to document the new surface.

src/ai_company/security/service.py (2)

86-131: LGTM!

The __init__ signature extension follows the established pattern of optional dependency injection with sensible defaults. When output_scan_policy is None, the factory builds one from config.output_scan_policy_type — this provides both config-driven and programmatic flexibility.

272-290: LGTM!

The policy application logic is well-designed:

Applied after audit recording, so original findings are preserved in audit

Gracefully degrades to raw scan result on policy failure (scanner-level redaction still applies)

Correctly re-raises MemoryError and RecursionError as non-recoverable

Error logging includes policy name for debuggability

tests/unit/security/test_output_scan_policy_factory.py (1)

30-78: LGTM!

Comprehensive test coverage for the factory:

All four policy types tested with correct instance and name assertions

AUTONOMY_TIERED tested both with and without effective_autonomy

Verifies that effective_autonomy is ignored for non-tiered types

Includes the unknown policy type error path test (line 75-78)

tests/unit/security/test_config.py (1)

284-310: LGTM!

Good test coverage for OutputScanPolicyType:

Parametrized test ensures all enum members are accepted in SecurityConfig

Invalid string value correctly rejected with ValidationError

Enum values validated against expected strings

tests/unit/security/test_service.py (2)

612-665: LGTM!

Good coverage of the audit failure scenario — verifies that scan_output returns results even when audit recording fails, and that exceptions don't propagate.

668-822: LGTM!

Comprehensive policy integration tests:

Verifies policy.apply is called with correct arguments

Tests policy transformation via WithholdPolicy

Confirms policy is invoked on both sensitive and clean results

Tests default policy passthrough behavior

Covers both recoverable (returns raw result) and non-recoverable (MemoryError/RecursionError propagate) error paths

tests/unit/security/test_output_scan_policy.py (1)

1-256: LGTM!

Excellent test coverage for all policy implementations:

RedactPolicy: Verifies passthrough for both sensitive and clean results

WithholdPolicy: Tests content clearing, findings preservation, and immutability contract

LogOnlyPolicy: Confirms empty result returned regardless of input

AutonomyTieredPolicy: Covers all autonomy levels with default map, custom map override, and missing-level fallback

Protocol compliance: Parametrized test ensures all concrete classes satisfy OutputScanResponsePolicy

src/ai_company/security/config.py

tests/unit/security/test_service.py

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unit/security/test_service.py`:
- Line 827: Import of LogOnlyPolicy is done locally instead of alongside other
top-level policy imports; move the "LogOnlyPolicy" import to the module-level
imports to match "WithholdPolicy" and other policies. Edit the test module to
remove the local "from ai_company.security.output_scan_policy import
LogOnlyPolicy" and add it to the existing block of top-level imports so all
policy classes are imported consistently.
- Around line 675-693: The helper _make_service_with_policy returns a
SecOpsService but doesn't attach the test-only attribute _test_audit_log like
_make_service does, causing later tests to access service._audit_log directly;
update _make_service_with_policy to mirror _make_service by setting the returned
SecOpsService's _test_audit_log to the AuditLog instance (audit_log) so tests
can use the same test attribute access pattern when interacting with
service._audit_log or related assertions.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 251f1133-0efc-44c7-81b2-d0b8cbc5e688

📥 Commits

Reviewing files that changed from the base of the PR and between 43ce881 and 4a8faff.

📒 Files selected for processing (1)

tests/unit/security/test_service.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Agent
GitHub Check: Greptile Review

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use Python 3.14+ (PEP 649 native lazy annotations) — do NOT use from __future__ import annotations
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14
All public functions and classes must have type hints — mypy strict mode is enforced
All public classes and functions must have Google-style docstrings — enforced by ruff D rules
Create new objects instead of mutating existing ones (immutability principle); for non-Pydantic internal collections, use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Separate frozen Pydantic models for config/identity from mutable-via-copy models for runtime state that evolves; never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict); use @computed_field for derived values instead of storing redundant fields; use NotBlankStr for all identifier/name fields instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls) over bare create_task for structured concurrency
Line length: 88 characters maximum — enforced by ruff
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)
All provider calls must go through BaseCompletionProvider which applies retry and rate limiting automatically — never implement retry logic in driver subclasses or calling code
RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig; retryable errors (RateLimitError, ...

Files:

tests/unit/security/test_service.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow to categorize tests
Maintain 80% minimum code coverage — enforced in CI
Use asyncio_mode = "auto" in pytest configuration — no manual @pytest.mark.asyncio needed for async tests
Set test timeout to 30 seconds per test; use pytest-xdist via -n auto for parallel execution; prefer @pytest.mark.parametrize for testing similar cases

Files:

tests/unit/security/test_service.py

🔇 Additional comments (4)

tests/unit/security/test_service.py (4)

29-31: LGTM!

The TYPE_CHECKING guard for OutputScanResponsePolicy is the correct pattern—it avoids runtime import overhead while providing type hints for the test helper signature at line 679.

615-666: LGTM!

The tests correctly verify that scan_output fails safe when audit recording raises, returning the scan result without propagating the exception. This aligns with the PR objective's error handling that "fails safe to the raw scan result."

793-821: LGTM!

The parametrized test correctly verifies that non-recoverable errors (MemoryError, RecursionError) propagate as specified in the PR objectives. This aligns with the stated behavior: "MemoryError/RecursionError are re-raised."

823-850: LGTM!

This test correctly verifies that audit entries capture original findings before the policy transforms the result. The LogOnlyPolicy clears findings in the returned result (has_sensitive_data=False), but the audit entry preserves the original "Bearer token" finding. This ensures forensic traceability as mentioned in the PR objectives.

The previous review comment about the unused type: ignore on line 847 has been properly addressed.

tests/unit/security/test_service.py

src/ai_company/security/output_scan_policy.py

Copilot

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-10T22:34:22Z

src/ai_company/security/output_scan_policy.py

+        raw = policy_map if policy_map is not None else _DEFAULT_AUTONOMY_POLICY_MAP
+        self._policy_map: Mapping[AutonomyLevel, OutputScanResponsePolicy] = (
+            MappingProxyType(copy.deepcopy(dict(raw)))
+        )


AutonomyTieredPolicy deep-copies policy_map values (copy.deepcopy(dict(raw))). This is unnecessary for immutability (a shallow dict(raw) already decouples from later mutations) and can break callers that pass non-deepcopyable policy instances (e.g., mocks, policies holding locks/resources), raising at runtime during policy construction. Prefer MappingProxyType(dict(raw)) (or a shallow copy) and document that policy instances are treated as immutable/stateless.

Copilot · 2026-03-10T22:34:23Z

src/ai_company/security/output_scan_policy.py

+        )
+        if not scan_result.has_sensitive_data:
+            return scan_result
+        return scan_result.model_copy(update={"redacted_content": None})


WithholdPolicy forces redacted_content=None to trigger ToolInvoker's fail-closed branch, but ToolInvoker currently treats this as "no redacted content available" and logs/metrics that message. That becomes misleading when redaction was available but intentionally withheld by policy. Consider adding an explicit signal (e.g., a withheld flag on OutputScanResult or a distinct field/enum for outcome) and updating the invoker to log/report the correct reason when output is withheld by policy.

Suggested change

return scan_result.model_copy(update={"redacted_content": None})

return scan_result.model_copy(

update={

"redacted_content": None,

"withheld": True,

}

)

Copilot · 2026-03-10T22:34:23Z

README.md

 - **Memory Backend Adapter (M5)** - Memory protocols, retrieval pipeline, org memory, and consolidation are complete; initial Mem0 adapter backend ([ADR-001](docs/decisions/ADR-001-memory-layer.md)) pending; research backends (GraphRAG, Temporal KG) planned
 - **CLI Surface** - `cli/` package is placeholder-only
- **Security/Approval System (M7)** - SecOps agent with rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, risk classifier, and ToolInvoker integration are implemented; progressive trust (4 strategies), promotion/demotion, autonomy levels (5 tiers with presets, resolver, change strategies) and approval timeout policies (wait-forever, auto-deny, tiered, escalation-chain with task park/resume) are implemented; real authentication (JWT/OAuth) and approval workflow gates are planned
+- **Security/Approval System (M7)** - SecOps agent with rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, and ToolInvoker integration are implemented; progressive trust (4 strategies), promotion/demotion, autonomy levels (5 tiers with presets, resolver, change strategies) and approval timeout policies (wait-forever, auto-deny, tiered, escalation-chain with task park/resume) are implemented; real authentication (JWT/OAuth) and approval workflow gates are planned


This bullet is under "Not implemented yet" but states the Security/Approval System features "are implemented". Either move it to the Implemented section (or reword it as planned/in-progress) to avoid contradictory project status documentation.

Suggested change

- **Security/Approval System (M7)** - SecOps agent with rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, and ToolInvoker integration are implemented; progressive trust (4 strategies), promotion/demotion, autonomy levels (5 tiers with presets, resolver, change strategies) and approval timeout policies (wait-forever, auto-deny, tiered, escalation-chain with task park/resume) are implemented; real authentication (JWT/OAuth) and approval workflow gates are planned

- **Security/Approval System (M7)** - SecOps agent with rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, and ToolInvoker integration are partially implemented and under active development; progressive trust (4 strategies), promotion/demotion, autonomy levels (5 tiers with presets, resolver, change strategies) and approval timeout policies (wait-forever, auto-deny, tiered, escalation-chain with task park/resume) are partially implemented and continuing to evolve; real authentication (JWT/OAuth) and approval workflow gates are planned

- Change default output_scan_policy_type from REDACT to AUTONOMY_TIERED so autonomy-aware policy routing is active by default (CodeRabbit) - Replace deepcopy with shallow copy + MappingProxyType in AutonomyTieredPolicy to avoid breaking non-copyable policies (Copilot) - Document stateless/immutable expectation on OutputScanResponsePolicy protocol docstring (Greptile) - Add comment explaining intentional import-time integrity guard on _DEFAULT_AUTONOMY_POLICY_MAP (Greptile) - Add unit test asserting default map covers all AutonomyLevel members - Fix f-string in AutonomyTieredPolicy log to use structured kwargs (Gemini) - Move LogOnlyPolicy import to top-level in test_service.py (CodeRabbit) - Align _make_service_with_policy helper to attach _test_audit_log (CodeRabbit) - Fix README contradictory status: move security items to Implemented (Copilot)

greptile-apps · 2026-03-10T22:56:53Z

src/ai_company/security/service.py

+        except MemoryError, RecursionError:
+            raise


Python 3 SyntaxError: invalid multi-exception catch syntax

except MemoryError, RecursionError: is the old Python 2 binding syntax (except ExcType, varname:), which was explicitly removed in Python 3 (PEP 3110). In Python 3, this is a SyntaxError, meaning the entire service.py module will fail to import. Parentheses are required to catch multiple exception types:

Suggested change

except MemoryError, RecursionError:

raise

except (MemoryError, RecursionError):

Note: the same invalid pattern appears at lines 179, 263, 381, and 433 in this file — those are pre-existing occurrences with the same problem and should be fixed as well.

Prompt To Fix With AI

This is a comment left during a code review. Path: src/ai_company/security/service.py Line: 280-281 Comment: **Python 3 SyntaxError: invalid multi-exception catch syntax** `except MemoryError, RecursionError:` is the old Python 2 binding syntax (`except ExcType, varname:`), which was explicitly removed in Python 3 (PEP 3110). In Python 3, this is a `SyntaxError`, meaning the entire `service.py` module will fail to import. Parentheses are required to catch multiple exception types: ```suggestion except (MemoryError, RecursionError): ``` Note: the same invalid pattern appears at lines 179, 263, 381, and 433 in this file — those are pre-existing occurrences with the same problem and should be fixed as well. How can I resolve this? If you propose a fix, please make it concise.

Copilot AI review requested due to automatic review settings March 10, 2026 21:53

Copilot started reviewing on behalf of Aureliolo March 10, 2026 21:53 View session

greptile-apps bot reviewed Mar 10, 2026

View reviewed changes

src/ai_company/security/output_scan_policy_factory.py Show resolved Hide resolved

Copilot AI reviewed Mar 10, 2026

View reviewed changes

coderabbitai bot reviewed Mar 10, 2026

View reviewed changes

src/ai_company/security/output_scan_policy.py Show resolved Hide resolved

tests/unit/security/test_output_scan_policy_factory.py Show resolved Hide resolved

tests/unit/tools/test_invoker_security.py Outdated Show resolved Hide resolved

gemini-code-assist bot reviewed Mar 10, 2026

View reviewed changes

Aureliolo added 3 commits March 10, 2026 23:20

Aureliolo force-pushed the feat/output-scanner-integration branch from 548439c to 43ce881 Compare March 10, 2026 22:24

fix: remove unused type-ignore comment in test_service

4a8faff

Mypy resolves _audit_log without the ignore — CI was failing on unused-ignore.

Copilot AI review requested due to automatic review settings March 10, 2026 22:26

Copilot started reviewing on behalf of Aureliolo March 10, 2026 22:27 View session

coderabbitai bot reviewed Mar 10, 2026

View reviewed changes

src/ai_company/security/config.py Outdated Show resolved Hide resolved

tests/unit/security/test_service.py Outdated Show resolved Hide resolved

coderabbitai bot reviewed Mar 10, 2026

View reviewed changes

tests/unit/security/test_service.py Show resolved Hide resolved

tests/unit/security/test_service.py Outdated Show resolved Hide resolved

greptile-apps bot reviewed Mar 10, 2026

View reviewed changes

src/ai_company/security/output_scan_policy.py Show resolved Hide resolved

src/ai_company/security/output_scan_policy.py Show resolved Hide resolved

Copilot AI reviewed Mar 10, 2026

View reviewed changes

Aureliolo mentioned this pull request Mar 10, 2026

feat: add explicit withheld signal to OutputScanResult for WithholdPolicy #284

Closed

Aureliolo merged commit b9907e8 into main Mar 10, 2026
9 of 10 checks passed

Aureliolo deleted the feat/output-scanner-integration branch March 10, 2026 22:51

greptile-apps bot reviewed Mar 10, 2026

View reviewed changes

		Required when ``policy_type`` is ``AUTONOMY_TIERED``;
		ignored otherwise.

-            Required when ``policy_type`` is ``AUTONOMY_TIERED``;
-            ignored otherwise.
+            Used when ``policy_type`` is ``AUTONOMY_TIERED``. If ``None``
+            in that case, a warning is logged and the policy will fall
+            back to ``RedactPolicy``. Ignored for other policy types.

	except MemoryError, RecursionError:
	raise
	except (MemoryError, RecursionError):

Conversation

Aureliolo commented Mar 10, 2026

Summary

Output Scan Policies

Pre-PR Review Hardening (17 findings from 10 agents)

Test plan

Uh oh!

github-actions bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Dependency Review

Scanned Files

Uh oh!

coderabbitai bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Summary by CodeRabbit

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

❌ Failed checks (1 warning)

Uh oh!

gemini-code-assist bot commented Mar 10, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

greptile-apps bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 1/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 10, 2026

github-actions bot commented Mar 10, 2026 •

edited

Loading

coderabbitai bot commented Mar 10, 2026 •

edited

Loading

greptile-apps bot commented Mar 10, 2026 •

edited

Loading