Skip to content

feat: add pluggable output scan response policies#280

Merged
Aureliolo merged 5 commits intomainfrom
feat/output-scanner-integration
Mar 10, 2026
Merged

feat: add pluggable output scan response policies#280
Aureliolo merged 5 commits intomainfrom
feat/output-scanner-integration

Conversation

@Aureliolo
Copy link
Copy Markdown
Owner

Summary

  • Add pluggable output scan response policies (OutputScanResponsePolicy protocol) with 4 built-in strategies: RedactPolicy, WithholdPolicy, LogOnlyPolicy, AutonomyTieredPolicy
  • Add OutputScanPolicyType config enum and build_output_scan_policy() factory for config-driven policy selection
  • Integrate policy into SecOpsService.scan_output() with proper error handling (fail-safe to raw scan result)
  • Simplify Docker CI scan configuration (merge Trivy CRITICAL+HIGH into single step, raise Grype cutoff to HIGH, remove CVE ignore files)
  • Comprehensive test coverage: policy unit tests, factory tests, service integration tests, invoker gap tests (6691 tests pass, 94.83% coverage)

Output Scan Policies

Policy Behavior Default Autonomy Level
Redact (default) Pass through scanner's redaction SEMI, SUPERVISED
Withhold Clear redacted content — fail-closed LOCKED
Log-only Log findings at WARNING, pass original output FULL
Autonomy-tiered Delegate to sub-policy based on autonomy level Composite

Pre-PR Review Hardening (17 findings from 10 agents)

  • Error handling around policy.apply() with MemoryError/RecursionError re-raise
  • LogOnlyPolicy upgraded from DEBUG to WARNING when sensitive data detected
  • AutonomyTieredPolicy warns on unmapped autonomy level fallback
  • _DEFAULT_AUTONOMY_POLICY_MAP wrapped in MappingProxyType (immutability)
  • Factory function wires OutputScanPolicyType config enum to runtime policy instances
  • Docstring improvements (WithholdPolicy findings preservation, LogOnlyPolicy clarity, SecurityConfig attributes)
  • DESIGN_SPEC.md §12.3 and §15.3 updated with policy documentation

Test plan

  • All 4 policies tested for sensitive and clean inputs
  • Protocol compliance verified via isinstance checks
  • AutonomyTieredPolicy tested for all 4 autonomy levels + None fallback + custom map + missing key fallback
  • Factory tested for all enum members + autonomy integration
  • Service integration: policy applied after scan, policy transforms result, no-policy passthrough, policy failure returns raw result, MemoryError propagates
  • Config: OutputScanPolicyType defaults, all members accepted, invalid rejected, JSON roundtrip
  • Invoker gaps: non-recoverable errors propagate, tool errors skip scan, content passing, invoke_all scanning, soft error scanning
  • Audit failure isolation preserved
  • 6691 passed, 9 skipped, 94.83% coverage

Closes #263

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings March 10, 2026 21:53
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 10, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 10, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1405b851-bfd4-4fb6-b7f1-a6e4e04cec42

📥 Commits

Reviewing files that changed from the base of the PR and between 4a8faff and 2e3a3aa.

📒 Files selected for processing (6)
  • README.md
  • src/ai_company/security/config.py
  • src/ai_company/security/output_scan_policy.py
  • tests/unit/security/test_config.py
  • tests/unit/security/test_output_scan_policy.py
  • tests/unit/security/test_service.py

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Configurable post-scan response system with four policies: redact (default), withhold, log-only, and autonomy‑tiered; service applies the selected policy after output scanning.
    • Added an observability event for when a policy is applied and config-based policy selection.
  • Documentation

    • Updated design docs and README to describe output-scan policy options and configuration.
  • Tests

    • Added comprehensive unit tests for policies, factory selection, service integration, and tool-invoker scenarios.

Walkthrough

Adds a pluggable post-output handling layer: new OutputScanPolicyType config enum and SecurityConfig field, a Policy protocol with four concrete policies and default autonomy→policy map, a factory to build policies, SecOpsService wiring to apply policies after scanning/audit (with observability), and accompanying unit tests and docs updates.

Changes

Cohort / File(s) Summary
Config & docs
src/ai_company/security/config.py, CLAUDE.md, DESIGN_SPEC.md, README.md
Add OutputScanPolicyType enum and SecurityConfig.output_scan_policy_type (default AUTONOMY_TIERED per design); update docs to list "output scan response policies".
Policy implementations
src/ai_company/security/output_scan_policy.py
Add OutputScanResponsePolicy Protocol and four implementations: RedactPolicy, WithholdPolicy, LogOnlyPolicy, AutonomyTieredPolicy; include _DEFAULT_AUTONOMY_POLICY_MAP and emit SECURITY_OUTPUT_SCAN_POLICY_APPLIED when applied.
Policy factory
src/ai_company/security/output_scan_policy_factory.py
Add build_output_scan_policy() to construct policy instances from OutputScanPolicyType; handle AUTONOMY_TIERED with optional effective_autonomy, log on missing autonomy, and error on unknown types.
Service integration
src/ai_company/security/service.py
SecOpsService gains optional output_scan_policy parameter (or builds one from config) and applies the policy in scan_output() after scanning and audit; logs and returns raw redacted result on policy apply failure.
Public API surface
src/ai_company/security/__init__.py
Expose OutputScanPolicyType, OutputScanResponsePolicy, RedactPolicy, WithholdPolicy, LogOnlyPolicy, AutonomyTieredPolicy, and build_output_scan_policy on the security package surface.
Observability
src/ai_company/observability/events/security.py
Add event constant SECURITY_OUTPUT_SCAN_POLICY_APPLIED.
Tests — policies & factory
tests/unit/security/test_output_scan_policy.py, tests/unit/security/test_output_scan_policy_factory.py
Comprehensive tests for policy behavior, delegation, custom maps, factory instantiation, and protocol compliance.
Tests — config
tests/unit/security/test_config.py
Tests for OutputScanPolicyType integration in SecurityConfig, defaults, JSON round-trip, and validation.
Tests — service & tools integration
tests/unit/security/test_service.py, tests/unit/tools/test_invoker_security.py
Tests covering policy application in scan_output(), audit interaction and error handling, and ToolInvoker output-scanning flows (soft errors, multiple calls, denial/escalation).

Sequence Diagram(s)

sequenceDiagram
    participant Tool as Tool
    participant SecOps as SecOpsService
    participant Scanner as OutputScanner
    participant Audit as AuditLog
    participant Policy as OutputScanPolicy
    participant Context as SecurityContext

    Tool->>SecOps: return output content
    SecOps->>Scanner: scan(content)
    Scanner-->>SecOps: OutputScanResult
    SecOps->>Audit: record(scan_result)
    Audit-->>SecOps: recorded
    alt output_scan_policy configured
        SecOps->>Policy: apply(scan_result, Context)
        Policy-->>SecOps: transformed OutputScanResult
        SecOps->>Context: emit SECURITY_OUTPUT_SCAN_POLICY_APPLIED
    else no policy or policy fails
        Note over SecOps: return raw (redacted) scan_result
    end
    SecOps-->>Tool: sanitized result
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 57.53% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely summarizes the main change: adding pluggable output scan response policies. It directly reflects the primary feature introduced in the changeset.
Description check ✅ Passed The description is comprehensive and directly related to the changeset. It outlines the policies, configuration, integration, test coverage, and links to the related issue, all of which are reflected in the code changes.
Linked Issues check ✅ Passed The PR successfully implements the core objectives from issue #263: OutputScanner integration into ToolInvoker, configurable response policies per autonomy level, and post-tool scanning with redaction.
Out of Scope Changes check ✅ Passed All code changes are within scope of issue #263 and the PR objectives. Minor documentation updates and CI improvements are supportive of the main feature and not out-of-scope.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/output-scanner-integration
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch feat/output-scanner-integration

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the security framework by introducing a pluggable policy system for handling the results of output scans. It allows the system to respond to detected sensitive data in a configurable manner, such as redacting, withholding, or merely logging the findings, with behavior potentially varying based on the agent's autonomy level. This provides greater control and flexibility in managing sensitive information, improving the overall security posture without disrupting core operations.

Highlights

  • Pluggable Output Scan Response Policies: Introduced a new OutputScanResponsePolicy protocol with four built-in strategies: RedactPolicy, WithholdPolicy, LogOnlyPolicy, and AutonomyTieredPolicy.
  • Config-Driven Policy Selection: Added an OutputScanPolicyType configuration enum and a build_output_scan_policy() factory function for selecting policies based on configuration.
  • SecOpsService Integration: Integrated the new policy system into SecOpsService.scan_output(), ensuring proper error handling that defaults to the raw scan result in case of policy application failure.
  • Documentation and Hardening: Updated DESIGN_SPEC.md and CLAUDE.md with policy documentation, and addressed pre-PR review findings including error handling, logging levels, and immutability of default policy maps.
  • Comprehensive Test Coverage: Ensured extensive test coverage for policy units, factory functions, service integration, and invoker gap scenarios, with 6691 tests passing and 94.83% coverage.
Changelog
  • CLAUDE.md
    • Updated the security section to include output scan response policies.
  • DESIGN_SPEC.md
    • Updated the implementation snapshot to reflect the completion of output scan response policies.
    • Added a new section detailing the functionality and behavior of output scan response policies.
  • src/ai_company/observability/events/security.py
    • Added a new event constant, SECURITY_OUTPUT_SCAN_POLICY_APPLIED, for logging policy application.
  • src/ai_company/security/init.py
    • Imported and exposed new security configuration and policy-related modules and classes.
  • src/ai_company/security/config.py
    • Introduced the OutputScanPolicyType enum for declarative policy selection.
    • Added output_scan_policy_type field to SecurityConfig to configure the active policy.
  • src/ai_company/security/output_scan_policy.py
    • Added a new file defining the OutputScanResponsePolicy protocol.
    • Implemented RedactPolicy, WithholdPolicy, LogOnlyPolicy, and AutonomyTieredPolicy as concrete policy strategies.
  • src/ai_company/security/output_scan_policy_factory.py
    • Added a new file containing the build_output_scan_policy factory function for creating policy instances.
  • src/ai_company/security/service.py
    • Modified the SecOpsService to accept and apply an OutputScanResponsePolicy after scanning tool output.
    • Implemented robust error handling for policy application, ensuring a fail-safe return of raw scan results.
  • tests/unit/security/test_config.py
    • Added tests for the OutputScanPolicyType enum, including acceptance in SecurityConfig and JSON roundtrip validation.
  • tests/unit/security/test_output_scan_policy.py
    • Added a new file with comprehensive unit tests for RedactPolicy, WithholdPolicy, LogOnlyPolicy, and AutonomyTieredPolicy.
  • tests/unit/security/test_output_scan_policy_factory.py
    • Added a new file with unit tests for the build_output_scan_policy factory function.
  • tests/unit/security/test_service.py
    • Extended tests for SecOpsService to cover audit failure isolation and the integration of output scan policies.
    • Added tests for policy application, result transformation, and error propagation scenarios.
  • tests/unit/tools/test_invoker_security.py
    • Added new test cases for invoker gaps, including non-recoverable errors during scanning, skipping scans on tool execution errors, content passing to scan_output, invoke_all output scanning, and scanning of soft error content.
Activity
  • Pre-PR review hardening addressed 17 findings from 10 agents, improving error handling, logging, and immutability.
  • Error handling for MemoryError and RecursionError during policy application was implemented to re-raise these critical exceptions.
  • LogOnlyPolicy was updated to log findings at a WARNING level when sensitive data is detected.
  • AutonomyTieredPolicy now includes warnings for unmapped autonomy levels, falling back gracefully.
  • The default autonomy-to-policy map (_DEFAULT_AUTONOMY_POLICY_MAP) was wrapped in MappingProxyType for immutability.
  • Comprehensive test coverage was added, including unit tests for all four policies, factory functions, and service integration, ensuring correct behavior across various scenarios.
  • Invoker gap tests were added to verify behavior in edge cases such as non-recoverable errors, tool execution failures, and invoke_all scenarios.
  • All 6691 tests passed, with 9 skipped, achieving 94.83% code coverage.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Mar 10, 2026

Greptile Summary

This PR introduces a pluggable output scan response policy system for the SecOpsService, adding a OutputScanResponsePolicy protocol with four built-in strategies (RedactPolicy, WithholdPolicy, LogOnlyPolicy, AutonomyTieredPolicy), a config enum (OutputScanPolicyType), and a factory function (build_output_scan_policy). The policies slot cleanly into SecOpsService.scan_output() via constructor injection or config-driven defaults, and the design is well-aligned with the codebase's protocol-first, fail-closed philosophy.

Key highlights and findings:

  • Critical syntax error: The newly added except MemoryError, RecursionError: at service.py:280 uses Python 2 exception-binding syntax that was explicitly removed in Python 3 (PEP 3110). This is a SyntaxError that prevents service.py from importing. The same pre-existing pattern appears at lines 179, 263, 381, and 433 and should be corrected to except (MemoryError, RecursionError): throughout.
  • The output_scan_policy.py implementation is clean and well-reasoned: MappingProxyType immutability, shallow-copy decoupling for custom policy_map, import-time integrity check with a companion CI test, and a WARNING-level log for LogOnlyPolicy when sensitive data is detected.
  • Previous review findings (event constant semantics, deepcopy removal, AutonomyTieredPolicy fallback warning) have all been addressed.
  • Test coverage is thorough: protocol compliance, all four autonomy levels, custom map fallback, service integration, and error-propagation cases.

Confidence Score: 1/5

  • Not safe to merge — service.py contains a Python 3 SyntaxError that will prevent the module from loading.
  • The new except MemoryError, RecursionError: at line 280 of service.py (and identical pre-existing patterns at lines 179, 263, 381, 433) uses the Python 2 comma-binding syntax removed by PEP 3110. In Python 3 this is a SyntaxError, making service.py unimportable. Everything else — the policy protocol, implementations, factory, config, and tests — is well-structured and correct. The score would be 4–5 once the syntax is fixed to except (MemoryError, RecursionError):.
  • src/ai_company/security/service.py — all five except MemoryError, RecursionError: occurrences must be parenthesised.

Important Files Changed

Filename Overview
src/ai_company/security/output_scan_policy.py Introduces the OutputScanResponsePolicy protocol and four concrete implementations (RedactPolicy, WithholdPolicy, LogOnlyPolicy, AutonomyTieredPolicy), along with the _DEFAULT_AUTONOMY_POLICY_MAP with an import-time integrity check. Implementation is clean, well-documented, and the previous review comments (deepcopy removal, MappingProxyType immutability) have been addressed.
src/ai_company/security/service.py Integrates the output scan policy into scan_output() and adds output_scan_policy constructor injection. Contains a critical Python 3 SyntaxError: except MemoryError, RecursionError: (Python 2 syntax, removed in PEP 3110) at line 280 — the new addition — plus pre-existing identical syntax errors at lines 179, 263, 381, and 433. The module will fail to import in Python 3.
src/ai_company/security/output_scan_policy_factory.py Clean factory function mapping OutputScanPolicyType enum values to policy instances via match. Previous review concern (wrong event constant) was already addressed — SECURITY_CONFIG_LOADED is now correctly used for the factory-time warning.
src/ai_company/security/config.py Adds OutputScanPolicyType StrEnum and output_scan_policy_type field to SecurityConfig with a sensible default of AUTONOMY_TIERED. Well-structured with clear docstrings.
src/ai_company/security/init.py Correctly exports all new public symbols (OutputScanPolicyType, all four policy classes, build_output_scan_policy) and updates __all__.
tests/unit/security/test_output_scan_policy.py Comprehensive policy unit tests covering all four strategies, protocol compliance via isinstance, all autonomy level combinations, custom map overrides, and the _DEFAULT_AUTONOMY_POLICY_MAP integrity assertion.
tests/unit/security/test_service.py New TestSecOpsScanOutputPolicy class tests policy application, transformation, no-policy passthrough, policy failure fallback, and MemoryError/RecursionError propagation. Coverage looks solid.
src/ai_company/observability/events/security.py Adds SECURITY_OUTPUT_SCAN_POLICY_APPLIED event constant. Clean addition consistent with the existing constant style.

Sequence Diagram

sequenceDiagram
    participant Invoker
    participant SecOpsService
    participant OutputScanner
    participant AuditLog
    participant OutputScanResponsePolicy

    Invoker->>SecOpsService: scan_output(context, output)
    SecOpsService->>OutputScanner: scan(output)
    OutputScanner-->>SecOpsService: OutputScanResult
    alt has_sensitive_data AND audit_enabled
        SecOpsService->>AuditLog: record(AuditEntry)
    end
    SecOpsService->>OutputScanResponsePolicy: apply(scan_result, context)
    note over OutputScanResponsePolicy: RedactPolicy → pass-through<br/>WithholdPolicy → clear redacted_content<br/>LogOnlyPolicy → return OutputScanResult()<br/>AutonomyTieredPolicy → delegate by level
    OutputScanResponsePolicy-->>SecOpsService: transformed OutputScanResult
    alt policy.apply() raises (non MemoryError/RecursionError)
        SecOpsService-->>Invoker: raw scan_result (fail-safe)
    else success
        SecOpsService-->>Invoker: transformed OutputScanResult
    end
Loading

Last reviewed commit: 2e3a3aa

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Introduces pluggable output scan response policies to control how sensitive-output findings are handled after OutputScanner runs, with supporting config surface and expanded unit/service tests. This extends the security subsystem’s post-tool output scanning so it can be tuned by strategy (redact/withhold/log-only/autonomy-tiered) rather than being hard-coded.

Changes:

  • Add OutputScanResponsePolicy protocol + built-in policy implementations and a factory (build_output_scan_policy) for enum-driven selection.
  • Integrate an optional output_scan_policy into SecOpsService.scan_output() (policy applied after audit recording; failures fall back to raw scan result).
  • Add/extend unit tests covering policy behavior, factory behavior, and invoker/service integration gaps; update docs/events/exports accordingly.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/ai_company/security/output_scan_policy.py Adds policy protocol + concrete policy implementations (redact/withhold/log-only/autonomy-tiered).
src/ai_company/security/output_scan_policy_factory.py Adds config-enum-to-policy factory for constructing policy instances.
src/ai_company/security/service.py Applies an optional response policy after scanning/auditing in scan_output().
src/ai_company/security/config.py Adds OutputScanPolicyType enum + SecurityConfig.output_scan_policy_type default.
src/ai_company/security/__init__.py Exposes new policy types/factory via package public API.
src/ai_company/observability/events/security.py Adds a new event constant for policy-related logging.
tests/unit/security/test_output_scan_policy.py New unit tests for each policy + protocol compliance checks.
tests/unit/security/test_output_scan_policy_factory.py New tests validating factory outputs for each enum value.
tests/unit/security/test_service.py Adds scan_output audit-failure isolation + policy integration tests.
tests/unit/security/test_config.py Adds config default/roundtrip/validation coverage for OutputScanPolicyType.
tests/unit/tools/test_invoker_security.py Adds invoker “gap tests” around output scanning behavior and error propagation.
DESIGN_SPEC.md / CLAUDE.md Documents the new policy concept and updates implementation snapshot.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 88 to 92
approval_store: ApprovalStore | None = None,
effective_autonomy: EffectiveAutonomy | None = None,
risk_classifier: RiskTierClassifier | None = None,
output_scan_policy: OutputScanResponsePolicy | None = None,
) -> None:
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SecurityConfig now exposes output_scan_policy_type, but the default SecOpsService wiring won’t use it unless callers explicitly pass output_scan_policy. This means changing config has no effect in the common path. Consider constructing a default policy in SecOpsService.__init__ when output_scan_policy is None (via build_output_scan_policy(config.output_scan_policy_type, effective_autonomy=effective_autonomy)), while keeping constructor injection as an override.

Copilot uses AI. Check for mistakes.
Comment on lines +34 to +35
Required when ``policy_type`` is ``AUTONOMY_TIERED``;
ignored otherwise.
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring says effective_autonomy is “Required when policy_type is AUTONOMY_TIERED”, but the implementation explicitly allows None (it logs a warning and falls back to RedactPolicy). Update the docstring to match the actual behavior, or enforce the requirement by raising a ValueError when policy_type is AUTONOMY_TIERED and effective_autonomy is None.

Suggested change
Required when ``policy_type`` is ``AUTONOMY_TIERED``;
ignored otherwise.
Used when ``policy_type`` is ``AUTONOMY_TIERED``. If ``None``
in that case, a warning is logged and the policy will fall
back to ``RedactPolicy``. Ignored for other policy types.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/ai_company/security/__init__.py (1)

1-14: 🧹 Nitpick | 🔵 Trivial

Update the module docstring to mention the new policy types.

The public API docstring lists OutputScanResult / OutputScanner but doesn't mention the newly exported policy types (OutputScanPolicyType, OutputScanResponsePolicy, policy implementations, build_output_scan_policy).

📝 Suggested docstring update
 - ``OutputScanResult`` / ``OutputScanner`` — post-tool output scanning.
+- ``OutputScanResponsePolicy`` — protocol for output scan policies.
+- ``RedactPolicy`` / ``WithholdPolicy`` / ``LogOnlyPolicy`` / ``AutonomyTieredPolicy`` — policy implementations.
+- ``OutputScanPolicyType`` / ``build_output_scan_policy`` — config-driven policy selection.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/security/__init__.py` around lines 1 - 14, Update the module
docstring at the top of src/ai_company/security/__init__.py to include the new
output-scan policy exports by adding mentions of OutputScanPolicyType,
OutputScanResponsePolicy, the policy implementations, and
build_output_scan_policy alongside the existing OutputScanResult / OutputScanner
entry; edit the public API list so it explicitly documents these symbols
(OutputScanPolicyType, OutputScanResponsePolicy, build_output_scan_policy, plus
any concrete policy class names) so they are discoverable in the module
docstring.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/security/output_scan_policy.py`:
- Around line 203-216: The constructor currently stores the caller-supplied
policy_map by reference, allowing external mutation; in the __init__ where
policy_map is handled (the policy_map parameter and self._policy_map
assignment), make a deep copy of the provided mapping (use copy.deepcopy) and
then wrap that copy with MappingProxyType before assigning to self._policy_map;
when policy_map is None still assign a MappingProxyType-wrapped copy of
_DEFAULT_AUTONOMY_POLICY_MAP so the stored map is always frozen and cannot be
mutated by callers.

In `@tests/unit/security/test_output_scan_policy_factory.py`:
- Around line 30-73: Add a unit test verifying the factory's defensive TypeError
for unknown types: inside TestBuildOutputScanPolicy add a method (e.g.
test_unknown_policy_type_raises_type_error) that calls build_output_scan_policy
with an invalid value (e.g. a string, using type: ignore) and asserts it raises
TypeError with a message containing "Unknown output scan policy type"; reference
build_output_scan_policy and the defensive branch in
output_scan_policy_factory.py to locate the behavior to test.

In `@tests/unit/tools/test_invoker_security.py`:
- Around line 624-656: Combine the two nearly identical tests
test_memory_error_from_scan_propagates and
test_recursion_error_from_scan_propagates into one parametrized test (e.g.,
test_nonrecoverable_scan_exceptions) using `@pytest.mark.parametrize` to iterate
over the exception instances or types (MemoryError("oom"), RecursionError("max
depth")) and their corresponding exception classes; in the single test set
interceptor = _make_interceptor(...), set interceptor.scan_output =
AsyncMock(side_effect=exception), create the ToolInvoker and assert with
pytest.raises(expected_exception) that await invoker.invoke(tool_call)
propagates the error. Ensure you reference and reuse the same setup
(security_registry, tool_call, _make_interceptor, ToolInvoker,
interceptor.scan_output) so adding more non-recoverable exceptions is simple.

---

Outside diff comments:
In `@src/ai_company/security/__init__.py`:
- Around line 1-14: Update the module docstring at the top of
src/ai_company/security/__init__.py to include the new output-scan policy
exports by adding mentions of OutputScanPolicyType, OutputScanResponsePolicy,
the policy implementations, and build_output_scan_policy alongside the existing
OutputScanResult / OutputScanner entry; edit the public API list so it
explicitly documents these symbols (OutputScanPolicyType,
OutputScanResponsePolicy, build_output_scan_policy, plus any concrete policy
class names) so they are discoverable in the module docstring.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f6a44960-3110-454c-a6b5-7bfa251531a0

📥 Commits

Reviewing files that changed from the base of the PR and between a488758 and 548439c.

📒 Files selected for processing (13)
  • CLAUDE.md
  • DESIGN_SPEC.md
  • src/ai_company/observability/events/security.py
  • src/ai_company/security/__init__.py
  • src/ai_company/security/config.py
  • src/ai_company/security/output_scan_policy.py
  • src/ai_company/security/output_scan_policy_factory.py
  • src/ai_company/security/service.py
  • tests/unit/security/test_config.py
  • tests/unit/security/test_output_scan_policy.py
  • tests/unit/security/test_output_scan_policy_factory.py
  • tests/unit/security/test_service.py
  • tests/unit/tools/test_invoker_security.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Agent
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do NOT use from __future__ import annotations — Python 3.14 has PEP 649 with native lazy annotations
Use except A, B: syntax (no parentheses) for exception handling — PEP 758 style, enforced by ruff on Python 3.14
Add type hints to all public functions; enforce with mypy strict mode
Use Google-style docstrings, required on public classes and functions (enforced by ruff D rules)
Create new objects instead of mutating existing ones; for non-Pydantic internal collections use copy.deepcopy() at construction and wrap with MappingProxyType for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with model_copy(update=...) for runtime state that evolves
Use Pydantic v2 BaseModel, model_validator, computed_field, and ConfigDict; use @computed_field for derived values instead of storing redundant fields
Use NotBlankStr from core.types for all identifier/name fields (including optional NotBlankStr | None and tuple[NotBlankStr, ...] variants) instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls) over bare create_task
Maintain line length of 88 characters (enforced by ruff)
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly; never silently swallow errors
Validate at system boundaries: user input, external APIs, and config files
Every module with business logic MUST have: from ai_company.observability import get_logger then logger = get_logger(__name__)
Always use variable name logger (not _logger, not log) for logger instances
Always use event name constants from domain-specific mo...

Files:

  • src/ai_company/security/__init__.py
  • tests/unit/security/test_output_scan_policy.py
  • tests/unit/security/test_config.py
  • src/ai_company/security/output_scan_policy.py
  • src/ai_company/security/output_scan_policy_factory.py
  • tests/unit/security/test_service.py
  • src/ai_company/security/service.py
  • src/ai_company/observability/events/security.py
  • tests/unit/security/test_output_scan_policy_factory.py
  • src/ai_company/security/config.py
  • tests/unit/tools/test_invoker_security.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Never use import logging, logging.getLogger(), or print() in application code

Files:

  • src/ai_company/security/__init__.py
  • src/ai_company/security/output_scan_policy.py
  • src/ai_company/security/output_scan_policy_factory.py
  • src/ai_company/security/service.py
  • src/ai_company/observability/events/security.py
  • src/ai_company/security/config.py
{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples; use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases. Vendor names may only appear in: (1) DESIGN_SPEC.md provider list, (2) .claude/ skill/agent files, (3) third-party import paths/module names. Tests must use test-provider, test-small-001, etc.

Files:

  • src/ai_company/security/__init__.py
  • tests/unit/security/test_output_scan_policy.py
  • tests/unit/security/test_config.py
  • src/ai_company/security/output_scan_policy.py
  • src/ai_company/security/output_scan_policy_factory.py
  • tests/unit/security/test_service.py
  • src/ai_company/security/service.py
  • src/ai_company/observability/events/security.py
  • tests/unit/security/test_output_scan_policy_factory.py
  • src/ai_company/security/config.py
  • tests/unit/tools/test_invoker_security.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow for test categorization
Use asyncio_mode = 'auto' for async tests — no manual @pytest.mark.asyncio needed
Set test timeout to 30 seconds per test
Use pytest-xdist via -n auto for parallel test execution
Prefer @pytest.mark.parametrize for testing similar cases

Files:

  • tests/unit/security/test_output_scan_policy.py
  • tests/unit/security/test_config.py
  • tests/unit/security/test_service.py
  • tests/unit/security/test_output_scan_policy_factory.py
  • tests/unit/tools/test_invoker_security.py
🧬 Code graph analysis (9)
tests/unit/security/test_output_scan_policy.py (4)
src/ai_company/core/enums.py (2)
  • AutonomyLevel (459-469)
  • ToolCategory (294-308)
src/ai_company/security/autonomy/models.py (1)
  • EffectiveAutonomy (154-191)
src/ai_company/security/models.py (2)
  • OutputScanResult (152-177)
  • SecurityContext (70-100)
src/ai_company/security/output_scan_policy.py (10)
  • AutonomyTieredPolicy (193-267)
  • LogOnlyPolicy (129-177)
  • OutputScanResponsePolicy (29-55)
  • RedactPolicy (58-88)
  • WithholdPolicy (91-126)
  • apply (41-55)
  • apply (69-88)
  • apply (105-126)
  • apply (144-177)
  • apply (224-267)
tests/unit/security/test_config.py (1)
src/ai_company/security/config.py (2)
  • OutputScanPolicyType (17-34)
  • SecurityConfig (91-138)
src/ai_company/security/output_scan_policy.py (4)
src/ai_company/core/enums.py (1)
  • AutonomyLevel (459-469)
src/ai_company/observability/_logger.py (1)
  • get_logger (8-28)
src/ai_company/security/models.py (1)
  • OutputScanResult (152-177)
src/ai_company/security/autonomy/models.py (1)
  • EffectiveAutonomy (154-191)
src/ai_company/security/output_scan_policy_factory.py (3)
src/ai_company/security/config.py (1)
  • OutputScanPolicyType (17-34)
src/ai_company/security/output_scan_policy.py (4)
  • LogOnlyPolicy (129-177)
  • OutputScanResponsePolicy (29-55)
  • RedactPolicy (58-88)
  • WithholdPolicy (91-126)
src/ai_company/security/autonomy/models.py (1)
  • EffectiveAutonomy (154-191)
tests/unit/security/test_service.py (7)
src/ai_company/security/output_scan_policy.py (12)
  • WithholdPolicy (91-126)
  • OutputScanResponsePolicy (29-55)
  • apply (41-55)
  • apply (69-88)
  • apply (105-126)
  • apply (144-177)
  • apply (224-267)
  • name (37-39)
  • name (65-67)
  • name (101-103)
  • name (140-142)
  • name (220-222)
src/ai_company/security/output_scanner.py (2)
  • OutputScanner (31-70)
  • scan (34-70)
src/ai_company/security/rules/engine.py (2)
  • RuleEngine (32-174)
  • evaluate (74-151)
src/ai_company/security/service.py (2)
  • SecOpsService (70-435)
  • scan_output (210-270)
src/ai_company/security/models.py (1)
  • OutputScanResult (152-177)
src/ai_company/security/config.py (1)
  • SecurityConfig (91-138)
src/ai_company/security/audit.py (2)
  • AuditLog (19-136)
  • record (49-69)
src/ai_company/security/service.py (1)
src/ai_company/security/output_scan_policy.py (11)
  • OutputScanResponsePolicy (29-55)
  • apply (41-55)
  • apply (69-88)
  • apply (105-126)
  • apply (144-177)
  • apply (224-267)
  • name (37-39)
  • name (65-67)
  • name (101-103)
  • name (140-142)
  • name (220-222)
tests/unit/security/test_output_scan_policy_factory.py (2)
src/ai_company/security/output_scan_policy.py (9)
  • AutonomyTieredPolicy (193-267)
  • LogOnlyPolicy (129-177)
  • RedactPolicy (58-88)
  • WithholdPolicy (91-126)
  • name (37-39)
  • name (65-67)
  • name (101-103)
  • name (140-142)
  • name (220-222)
tests/unit/security/test_output_scan_policy.py (1)
  • _make_autonomy (138-144)
src/ai_company/security/config.py (2)
src/ai_company/core/enums.py (2)
  • ActionType (372-408)
  • ApprovalRiskLevel (443-449)
src/ai_company/security/models.py (1)
  • SecurityVerdictType (23-32)
tests/unit/tools/test_invoker_security.py (5)
src/ai_company/tools/registry.py (1)
  • ToolRegistry (30-122)
src/ai_company/security/models.py (2)
  • SecurityVerdictType (23-32)
  • OutputScanResult (152-177)
src/ai_company/security/service.py (1)
  • scan_output (210-270)
src/ai_company/security/protocol.py (1)
  • scan_output (40-54)
src/ai_company/tools/invoker.py (2)
  • invoke (313-371)
  • registry (119-121)
🔇 Additional comments (18)
CLAUDE.md (1)

80-80: Documentation matches the new security surface.

This addition accurately captures the new output scan policy layer and its built-in modes in the package overview.

src/ai_company/observability/events/security.py (1)

33-33: LGTM!

The new event constant follows the established naming convention and type annotation pattern used throughout the file.

src/ai_company/security/output_scan_policy_factory.py (2)

1-21: LGTM!

Module setup follows conventions: proper imports, TYPE_CHECKING guard for EffectiveAutonomy, and correctly named logger.


24-70: LGTM!

The factory function is well-structured:

  • Exhaustive match on all OutputScanPolicyType members
  • Appropriate warning when AUTONOMY_TIERED is used without effective_autonomy
  • Defensive fallback with TypeError for unknown types (unreachable under normal conditions)
  • Google-style docstring with complete Args/Returns/Raises documentation
src/ai_company/security/__init__.py (1)

34-43: LGTM!

The new policy-related imports and __all__ entries are properly organized and maintain alphabetical ordering.

Also applies to: 50-74

src/ai_company/security/config.py (2)

17-34: LGTM!

The OutputScanPolicyType enum is well-designed:

  • Follows the established StrEnum pattern used elsewhere in the codebase
  • Clear docstring with Members section documenting each variant
  • Sensible default implied by docstring ("REDACT: Return redacted content — current behavior")

91-124: LGTM!

The output_scan_policy_type field is properly integrated into SecurityConfig:

  • Type annotation matches the enum
  • Default value REDACT preserves backward compatibility
  • Docstring updated to document the new field
tests/unit/security/test_config.py (2)

287-310: LGTM!

Comprehensive test coverage for OutputScanPolicyType:

  • Parametrized test covering all enum values in config integration
  • Validation error test for invalid types
  • Direct enum value assertions

173-173: LGTM!

Default value assertion correctly verifies REDACT is the default policy type.

tests/unit/security/test_output_scan_policy.py (5)

48-69: LGTM!

RedactPolicy tests correctly verify pass-through behavior for both sensitive and clean results.


74-102: LGTM!

WithholdPolicy tests verify the fail-closed behavior: redacted_content is cleared while findings are preserved for audit purposes.


107-128: LGTM!

LogOnlyPolicy tests confirm the audit-only behavior: returns empty OutputScanResult regardless of input findings.


133-231: LGTM!

AutonomyTieredPolicy tests provide excellent coverage:

  • Default policy map delegation for each autonomy level
  • Fallback behavior when effective_autonomy is None
  • Custom policy map override functionality
  • Fallback to RedactPolicy when a level is missing from a custom map

237-254: LGTM!

Protocol compliance tests verify all concrete policy classes satisfy the OutputScanResponsePolicy protocol via isinstance checks.

src/ai_company/security/service.py (3)

45-47: LGTM!

Import of OutputScanResponsePolicy with # noqa: TC001 follows the established pattern for type-only imports in this file.


81-118: LGTM!

Constructor extension is clean:

  • New output_scan_policy parameter with proper type annotation and default None
  • Docstring clearly explains the parameter's purpose and behavior when None
  • Stored in _output_scan_policy following the class's naming convention

256-269: LGTM!

Policy application in scan_output is well-implemented:

  • Correctly placed after audit recording (findings preserved in audit before policy transformation)
  • Fail-safe error handling: logs the exception with policy name for debugging and returns raw scan result
  • Consistent with exception handling patterns elsewhere in the class (MemoryError, RecursionError re-raised)
src/ai_company/security/output_scan_policy.py (1)

19-23: No action required — TYPE_CHECKING pattern is correct for Python 3.14.

The code follows the coding guideline: "Do NOT use from __future__ import annotations — Python 3.14 has PEP 649 with native lazy annotations." Under PEP 649, annotations are evaluated lazily only when accessed via introspection (e.g., typing.get_type_hints()). Since this module and its tests perform no annotation introspection, the TYPE_CHECKING pattern for Mapping, EffectiveAutonomy, and SecurityContext is safe and correct. No changes needed.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a well-designed and robust system for pluggable output scan response policies. The addition of four distinct policies, a configuration-driven factory, and seamless integration into the SecOpsService significantly enhances the security capabilities of the framework. The error handling is commendable, particularly the fail-safe mechanism for policy application failures. The test coverage is comprehensive, covering unit, integration, and various edge cases, which ensures high confidence in the changes. I have one minor suggestion to improve logging consistency.

Note: Security Review did not run due to the size of the PR.

Comment on lines +248 to +257
logger.warning(
SECURITY_OUTPUT_SCAN_POLICY_APPLIED,
policy="autonomy_tiered",
autonomy_level=autonomy_level.value,
note=(
f"No policy mapped for autonomy level "
f"'{autonomy_level.value}' — falling back to "
f"'{self._fallback.name}'"
),
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For consistency with the structured logging approach used elsewhere in the codebase (e.g., in LogOnlyPolicy), it's better to avoid f-strings in log messages. Passing dynamic values as separate keyword arguments makes logs more easily machine-readable and queryable.

                logger.warning(
                    SECURITY_OUTPUT_SCAN_POLICY_APPLIED,
                    policy="autonomy_tiered",
                    autonomy_level=autonomy_level.value,
                    fallback_to=self._fallback.name,
                    note="No policy mapped for autonomy level — falling back",
                )

Add OutputScanResponsePolicy protocol with 4 concrete strategies
(RedactPolicy, WithholdPolicy, LogOnlyPolicy, AutonomyTieredPolicy)
for configurable per-autonomy output scan behavior. Wire policy into
SecOpsService.scan_output and add OutputScanPolicyType enum to
SecurityConfig. Fill 6 test gaps across invoker and service (31 new
tests total).
…nd docs

Pre-reviewed by 10 agents, 17 findings addressed:
- Add build_output_scan_policy() factory to wire config enum to runtime
- Wrap policy.apply() in try/except (fail-safe to raw scan result)
- Upgrade LogOnlyPolicy logging to WARNING when sensitive data found
- Add WARNING on AutonomyTieredPolicy fallback for unmapped levels
- Wrap _DEFAULT_AUTONOMY_POLICY_MAP in MappingProxyType (immutability)
- Add OutputScanPolicyType config tests, SUPERVISED level test, factory
  tests, policy-on-clean-result test, custom-map-fallback test
- Update DESIGN_SPEC.md §12.3 with policy docs and §15.3 structure
- Update CLAUDE.md package structure description
- Improve docstrings (WithholdPolicy findings, LogOnlyPolicy clarity,
  SecurityConfig attributes)
…opilot, and Greptile

- Add deepcopy + MappingProxyType for AutonomyTieredPolicy policy_map immutability
- Auto-wire output_scan_policy from SecurityConfig in SecOpsService
- Add import-time exhaustiveness check for AutonomyLevel ↔ policy map
- Use safe getattr for policy name access in scan_output error handling
- Update docstrings across service, config, factory, and __init__ modules
- Parametrize non-recoverable exception tests (MemoryError/RecursionError)
- Add audit-before-policy-clears test for LogOnlyPolicy integration
- Add unknown policy type factory test
- Add immutability assertion for WithholdPolicy
- Update README security feature description
@Aureliolo Aureliolo force-pushed the feat/output-scanner-integration branch from 548439c to 43ce881 Compare March 10, 2026 22:24
Mypy resolves _audit_log without the ignore — CI was failing on
unused-ignore.
Copilot AI review requested due to automatic review settings March 10, 2026 22:26
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/security/config.py`:
- Line 123: The default for SecurityConfig.output_scan_policy_type is set to
OutputScanPolicyType.REDACT which causes FULL and LOCKED to behave like REDACT
and hides the new autonomy-aware mapping; change the default to
OutputScanPolicyType.AUTONOMY_TIERED (or remove the hard default so the factory
selects the appropriate policy based on autonomy level) so that the
autonomy-tiered behavior is the default; update any related initialization in
the factory that reads SecurityConfig.output_scan_policy_type to expect
AUTONOMY_TIERED and ensure FULL and LOCKED retain their distinct behaviors when
that value is set.

In `@tests/unit/security/test_service.py`:
- Around line 847-849: Remove the unnecessary type-ignore on the access to the
SecOpsService audit log: delete the " # type: ignore[attr-defined]" from the
line that assigns audit_log = service._audit_log in
tests/unit/security/test_service.py (the access to service._audit_log is valid
because _audit_log is a real instance attribute of SecOpsService), leaving the
direct assignment and subsequent assertions unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0db01553-c21f-4e1e-9c39-61ee4fda15d8

📥 Commits

Reviewing files that changed from the base of the PR and between 548439c and 43ce881.

📒 Files selected for processing (14)
  • CLAUDE.md
  • DESIGN_SPEC.md
  • README.md
  • src/ai_company/observability/events/security.py
  • src/ai_company/security/__init__.py
  • src/ai_company/security/config.py
  • src/ai_company/security/output_scan_policy.py
  • src/ai_company/security/output_scan_policy_factory.py
  • src/ai_company/security/service.py
  • tests/unit/security/test_config.py
  • tests/unit/security/test_output_scan_policy.py
  • tests/unit/security/test_output_scan_policy_factory.py
  • tests/unit/security/test_service.py
  • tests/unit/tools/test_invoker_security.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use Python 3.14+ (PEP 649 native lazy annotations) — do NOT use from __future__ import annotations
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14
All public functions and classes must have type hints — mypy strict mode is enforced
All public classes and functions must have Google-style docstrings — enforced by ruff D rules
Create new objects instead of mutating existing ones (immutability principle); for non-Pydantic internal collections, use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Separate frozen Pydantic models for config/identity from mutable-via-copy models for runtime state that evolves; never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict); use @computed_field for derived values instead of storing redundant fields; use NotBlankStr for all identifier/name fields instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls) over bare create_task for structured concurrency
Line length: 88 characters maximum — enforced by ruff
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)
All provider calls must go through BaseCompletionProvider which applies retry and rate limiting automatically — never implement retry logic in driver subclasses or calling code
RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig; retryable errors (RateLimitError, ...

Files:

  • tests/unit/security/test_config.py
  • tests/unit/security/test_output_scan_policy.py
  • src/ai_company/security/output_scan_policy_factory.py
  • tests/unit/security/test_output_scan_policy_factory.py
  • tests/unit/security/test_service.py
  • src/ai_company/observability/events/security.py
  • src/ai_company/security/config.py
  • src/ai_company/security/service.py
  • src/ai_company/security/output_scan_policy.py
  • tests/unit/tools/test_invoker_security.py
  • src/ai_company/security/__init__.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow to categorize tests
Maintain 80% minimum code coverage — enforced in CI
Use asyncio_mode = "auto" in pytest configuration — no manual @pytest.mark.asyncio needed for async tests
Set test timeout to 30 seconds per test; use pytest-xdist via -n auto for parallel execution; prefer @pytest.mark.parametrize for testing similar cases

Files:

  • tests/unit/security/test_config.py
  • tests/unit/security/test_output_scan_policy.py
  • tests/unit/security/test_output_scan_policy_factory.py
  • tests/unit/security/test_service.py
  • tests/unit/tools/test_invoker_security.py
**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor names (Anthropic, OpenAI, Claude, GPT, etc.) may only appear in DESIGN_SPEC.md when listing supported providers (not in other markdown files)

Files:

  • README.md
  • CLAUDE.md
  • DESIGN_SPEC.md
DESIGN_SPEC.md

📄 CodeRabbit inference engine (CLAUDE.md)

ALWAYS read DESIGN_SPEC.md before implementing any feature or planning any issue — the design spec is the starting point for architecture, data models, and behavior

Files:

  • DESIGN_SPEC.md
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Every module with business logic MUST import logging from ai_company.observability with logger = get_logger(__name__) — never use import logging, logging.getLogger(), or print() in application code
Always use constants from domain-specific modules under ai_company.observability.events for event names (e.g., PROVIDER_CALL_START from events.provider), importing directly: from ai_company.observability.events.<domain> import EVENT_CONSTANT
Use structured logging kwargs: always logger.info(EVENT, key=value) — never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising; all state transitions must log at INFO; DEBUG for object creation, internal flow, entry/exit of key functions
Pure data models, enums, and re-exports do NOT need logging
Use absolute paths or run commands directly in Bash — never use cd in Bash commands; working directory is already set to project root

Files:

  • src/ai_company/security/output_scan_policy_factory.py
  • src/ai_company/observability/events/security.py
  • src/ai_company/security/config.py
  • src/ai_company/security/service.py
  • src/ai_company/security/output_scan_policy.py
  • src/ai_company/security/__init__.py
🧠 Learnings (5)
📚 Learning: 2026-03-10T22:19:11.971Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-10T22:19:11.971Z
Learning: Applies to .claude/** : Vendor names (Anthropic, OpenAI, Claude, GPT, etc.) may appear in `.claude/` skill and agent files

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-10T22:19:11.970Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-10T22:19:11.970Z
Learning: Applies to DESIGN_SPEC.md : ALWAYS read `DESIGN_SPEC.md` before implementing any feature or planning any issue — the design spec is the starting point for architecture, data models, and behavior

Applied to files:

  • DESIGN_SPEC.md
📚 Learning: 2026-03-10T22:19:11.971Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-10T22:19:11.971Z
Learning: Applies to **/*.py : Create new objects instead of mutating existing ones (immutability principle); for non-Pydantic internal collections, use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement

Applied to files:

  • src/ai_company/security/output_scan_policy.py
📚 Learning: 2026-03-10T22:19:11.971Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-10T22:19:11.971Z
Learning: Applies to **/*.py : For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)

Applied to files:

  • src/ai_company/security/output_scan_policy.py
📚 Learning: 2026-03-10T22:19:11.971Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-10T22:19:11.971Z
Learning: Applies to src/**/*.py : Use structured logging kwargs: always `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`

Applied to files:

  • src/ai_company/security/output_scan_policy.py
🧬 Code graph analysis (9)
tests/unit/security/test_config.py (1)
src/ai_company/security/config.py (2)
  • OutputScanPolicyType (17-34)
  • SecurityConfig (91-138)
src/ai_company/security/output_scan_policy_factory.py (4)
src/ai_company/observability/_logger.py (1)
  • get_logger (8-28)
src/ai_company/security/config.py (1)
  • OutputScanPolicyType (17-34)
src/ai_company/security/output_scan_policy.py (4)
  • AutonomyTieredPolicy (206-282)
  • LogOnlyPolicy (130-178)
  • OutputScanResponsePolicy (30-56)
  • RedactPolicy (59-89)
src/ai_company/security/autonomy/models.py (1)
  • EffectiveAutonomy (154-191)
tests/unit/security/test_output_scan_policy_factory.py (5)
src/ai_company/core/enums.py (1)
  • AutonomyLevel (459-469)
src/ai_company/security/autonomy/models.py (1)
  • EffectiveAutonomy (154-191)
src/ai_company/security/config.py (1)
  • OutputScanPolicyType (17-34)
src/ai_company/security/output_scan_policy.py (9)
  • AutonomyTieredPolicy (206-282)
  • LogOnlyPolicy (130-178)
  • RedactPolicy (59-89)
  • WithholdPolicy (92-127)
  • name (38-40)
  • name (66-68)
  • name (102-104)
  • name (141-143)
  • name (235-237)
src/ai_company/security/output_scan_policy_factory.py (1)
  • build_output_scan_policy (25-73)
tests/unit/security/test_service.py (9)
src/ai_company/security/output_scan_policy.py (13)
  • WithholdPolicy (92-127)
  • OutputScanResponsePolicy (30-56)
  • apply (42-56)
  • apply (70-89)
  • apply (106-127)
  • apply (145-178)
  • apply (239-282)
  • name (38-40)
  • name (66-68)
  • name (102-104)
  • name (141-143)
  • name (235-237)
  • LogOnlyPolicy (130-178)
src/ai_company/security/output_scanner.py (2)
  • OutputScanner (31-70)
  • scan (34-70)
src/ai_company/security/rules/engine.py (2)
  • RuleEngine (32-174)
  • evaluate (74-151)
src/ai_company/security/service.py (2)
  • SecOpsService (74-456)
  • scan_output (223-291)
src/ai_company/security/models.py (1)
  • OutputScanResult (152-177)
src/ai_company/security/rules/data_leak_detector.py (2)
  • evaluate (83-108)
  • name (79-81)
src/ai_company/security/rules/policy_validator.py (2)
  • evaluate (57-98)
  • name (53-55)
src/ai_company/security/audit.py (4)
  • AuditLog (19-136)
  • record (49-69)
  • count (129-131)
  • entries (134-136)
tests/unit/security/test_output_scan_policy.py (1)
  • _make_context (22-30)
src/ai_company/security/config.py (2)
src/ai_company/core/enums.py (2)
  • ActionType (372-408)
  • ApprovalRiskLevel (443-449)
src/ai_company/security/models.py (1)
  • SecurityVerdictType (23-32)
src/ai_company/security/service.py (3)
src/ai_company/security/output_scan_policy.py (6)
  • OutputScanResponsePolicy (30-56)
  • apply (42-56)
  • apply (70-89)
  • apply (106-127)
  • apply (145-178)
  • apply (239-282)
src/ai_company/security/output_scan_policy_factory.py (1)
  • build_output_scan_policy (25-73)
src/ai_company/memory/errors.py (1)
  • MemoryError (13-14)
src/ai_company/security/output_scan_policy.py (3)
src/ai_company/core/enums.py (1)
  • AutonomyLevel (459-469)
src/ai_company/security/models.py (2)
  • OutputScanResult (152-177)
  • SecurityContext (70-100)
src/ai_company/security/autonomy/models.py (1)
  • EffectiveAutonomy (154-191)
tests/unit/tools/test_invoker_security.py (4)
src/ai_company/security/models.py (2)
  • SecurityVerdictType (23-32)
  • OutputScanResult (152-177)
src/ai_company/security/service.py (1)
  • scan_output (223-291)
src/ai_company/security/protocol.py (1)
  • scan_output (40-54)
src/ai_company/tools/invoker.py (2)
  • invoke (313-371)
  • registry (119-121)
src/ai_company/security/__init__.py (3)
src/ai_company/security/config.py (1)
  • OutputScanPolicyType (17-34)
src/ai_company/security/output_scan_policy.py (5)
  • AutonomyTieredPolicy (206-282)
  • LogOnlyPolicy (130-178)
  • OutputScanResponsePolicy (30-56)
  • RedactPolicy (59-89)
  • WithholdPolicy (92-127)
src/ai_company/security/output_scan_policy_factory.py (1)
  • build_output_scan_policy (25-73)
🪛 GitHub Actions: CI
tests/unit/security/test_service.py

[error] 847-847: mypy reported Unused 'type: ignore' comment [unused-ignore]. Command: 'uv run mypy src/ tests/'.

🪛 LanguageTool
README.md

[typographical] ~40-~40: To join two clauses or introduce examples, consider using an em dash.
Context: ...only - Security/Approval System (M7) - SecOps agent with rule engine (soft-allo...

(DASH_RULE)

🔇 Additional comments (10)
src/ai_company/security/output_scan_policy.py (1)

1-282: LGTM!

The implementation is well-structured with clear separation of concerns:

  • The OutputScanResponsePolicy protocol provides a clean contract
  • All four concrete policies (RedactPolicy, WithholdPolicy, LogOnlyPolicy, AutonomyTieredPolicy) implement the protocol correctly
  • The import-time validation (lines 194-203) ensures _DEFAULT_AUTONOMY_POLICY_MAP stays in sync with AutonomyLevel
  • The AutonomyTieredPolicy.__init__ now correctly freezes the policy map using copy.deepcopy() and MappingProxyType (lines 228-230), addressing the immutability requirement
src/ai_company/security/output_scan_policy_factory.py (1)

25-73: LGTM!

The factory function is well-implemented:

  • Exhaustive handling of all OutputScanPolicyType members via match statement
  • Appropriate warning when AUTONOMY_TIERED is used without effective_autonomy
  • Defensive error handling for unknown types with proper logging before raising TypeError
src/ai_company/security/__init__.py (1)

11-78: LGTM!

The public API surface is correctly expanded to expose:

  • OutputScanPolicyType enum for config-driven policy selection
  • OutputScanResponsePolicy protocol and all concrete implementations
  • build_output_scan_policy factory function

The __all__ list maintains alphabetical ordering and the module docstring is updated to document the new surface.

src/ai_company/security/service.py (2)

86-131: LGTM!

The __init__ signature extension follows the established pattern of optional dependency injection with sensible defaults. When output_scan_policy is None, the factory builds one from config.output_scan_policy_type — this provides both config-driven and programmatic flexibility.


272-290: LGTM!

The policy application logic is well-designed:

  • Applied after audit recording, so original findings are preserved in audit
  • Gracefully degrades to raw scan result on policy failure (scanner-level redaction still applies)
  • Correctly re-raises MemoryError and RecursionError as non-recoverable
  • Error logging includes policy name for debuggability
tests/unit/security/test_output_scan_policy_factory.py (1)

30-78: LGTM!

Comprehensive test coverage for the factory:

  • All four policy types tested with correct instance and name assertions
  • AUTONOMY_TIERED tested both with and without effective_autonomy
  • Verifies that effective_autonomy is ignored for non-tiered types
  • Includes the unknown policy type error path test (line 75-78)
tests/unit/security/test_config.py (1)

284-310: LGTM!

Good test coverage for OutputScanPolicyType:

  • Parametrized test ensures all enum members are accepted in SecurityConfig
  • Invalid string value correctly rejected with ValidationError
  • Enum values validated against expected strings
tests/unit/security/test_service.py (2)

612-665: LGTM!

Good coverage of the audit failure scenario — verifies that scan_output returns results even when audit recording fails, and that exceptions don't propagate.


668-822: LGTM!

Comprehensive policy integration tests:

  • Verifies policy.apply is called with correct arguments
  • Tests policy transformation via WithholdPolicy
  • Confirms policy is invoked on both sensitive and clean results
  • Tests default policy passthrough behavior
  • Covers both recoverable (returns raw result) and non-recoverable (MemoryError/RecursionError propagate) error paths
tests/unit/security/test_output_scan_policy.py (1)

1-256: LGTM!

Excellent test coverage for all policy implementations:

  • RedactPolicy: Verifies passthrough for both sensitive and clean results
  • WithholdPolicy: Tests content clearing, findings preservation, and immutability contract
  • LogOnlyPolicy: Confirms empty result returned regardless of input
  • AutonomyTieredPolicy: Covers all autonomy levels with default map, custom map override, and missing-level fallback
  • Protocol compliance: Parametrized test ensures all concrete classes satisfy OutputScanResponsePolicy

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unit/security/test_service.py`:
- Line 827: Import of LogOnlyPolicy is done locally instead of alongside other
top-level policy imports; move the "LogOnlyPolicy" import to the module-level
imports to match "WithholdPolicy" and other policies. Edit the test module to
remove the local "from ai_company.security.output_scan_policy import
LogOnlyPolicy" and add it to the existing block of top-level imports so all
policy classes are imported consistently.
- Around line 675-693: The helper _make_service_with_policy returns a
SecOpsService but doesn't attach the test-only attribute _test_audit_log like
_make_service does, causing later tests to access service._audit_log directly;
update _make_service_with_policy to mirror _make_service by setting the returned
SecOpsService's _test_audit_log to the AuditLog instance (audit_log) so tests
can use the same test attribute access pattern when interacting with
service._audit_log or related assertions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 251f1133-0efc-44c7-81b2-d0b8cbc5e688

📥 Commits

Reviewing files that changed from the base of the PR and between 43ce881 and 4a8faff.

📒 Files selected for processing (1)
  • tests/unit/security/test_service.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Agent
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use Python 3.14+ (PEP 649 native lazy annotations) — do NOT use from __future__ import annotations
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14
All public functions and classes must have type hints — mypy strict mode is enforced
All public classes and functions must have Google-style docstrings — enforced by ruff D rules
Create new objects instead of mutating existing ones (immutability principle); for non-Pydantic internal collections, use copy.deepcopy() at construction and MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Separate frozen Pydantic models for config/identity from mutable-via-copy models for runtime state that evolves; never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict); use @computed_field for derived values instead of storing redundant fields; use NotBlankStr for all identifier/name fields instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls) over bare create_task for structured concurrency
Line length: 88 characters maximum — enforced by ruff
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly, never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)
All provider calls must go through BaseCompletionProvider which applies retry and rate limiting automatically — never implement retry logic in driver subclasses or calling code
RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig; retryable errors (RateLimitError, ...

Files:

  • tests/unit/security/test_service.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow to categorize tests
Maintain 80% minimum code coverage — enforced in CI
Use asyncio_mode = "auto" in pytest configuration — no manual @pytest.mark.asyncio needed for async tests
Set test timeout to 30 seconds per test; use pytest-xdist via -n auto for parallel execution; prefer @pytest.mark.parametrize for testing similar cases

Files:

  • tests/unit/security/test_service.py
🔇 Additional comments (4)
tests/unit/security/test_service.py (4)

29-31: LGTM!

The TYPE_CHECKING guard for OutputScanResponsePolicy is the correct pattern—it avoids runtime import overhead while providing type hints for the test helper signature at line 679.


615-666: LGTM!

The tests correctly verify that scan_output fails safe when audit recording raises, returning the scan result without propagating the exception. This aligns with the PR objective's error handling that "fails safe to the raw scan result."


793-821: LGTM!

The parametrized test correctly verifies that non-recoverable errors (MemoryError, RecursionError) propagate as specified in the PR objectives. This aligns with the stated behavior: "MemoryError/RecursionError are re-raised."


823-850: LGTM!

This test correctly verifies that audit entries capture original findings before the policy transforms the result. The LogOnlyPolicy clears findings in the returned result (has_sensitive_data=False), but the audit entry preserves the original "Bearer token" finding. This ensures forensic traceability as mentioned in the PR objectives.

The previous review comment about the unused type: ignore on line 847 has been properly addressed.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +228 to +231
raw = policy_map if policy_map is not None else _DEFAULT_AUTONOMY_POLICY_MAP
self._policy_map: Mapping[AutonomyLevel, OutputScanResponsePolicy] = (
MappingProxyType(copy.deepcopy(dict(raw)))
)
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AutonomyTieredPolicy deep-copies policy_map values (copy.deepcopy(dict(raw))). This is unnecessary for immutability (a shallow dict(raw) already decouples from later mutations) and can break callers that pass non-deepcopyable policy instances (e.g., mocks, policies holding locks/resources), raising at runtime during policy construction. Prefer MappingProxyType(dict(raw)) (or a shallow copy) and document that policy instances are treated as immutable/stateless.

Copilot uses AI. Check for mistakes.
)
if not scan_result.has_sensitive_data:
return scan_result
return scan_result.model_copy(update={"redacted_content": None})
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WithholdPolicy forces redacted_content=None to trigger ToolInvoker's fail-closed branch, but ToolInvoker currently treats this as "no redacted content available" and logs/metrics that message. That becomes misleading when redaction was available but intentionally withheld by policy. Consider adding an explicit signal (e.g., a withheld flag on OutputScanResult or a distinct field/enum for outcome) and updating the invoker to log/report the correct reason when output is withheld by policy.

Suggested change
return scan_result.model_copy(update={"redacted_content": None})
return scan_result.model_copy(
update={
"redacted_content": None,
"withheld": True,
}
)

Copilot uses AI. Check for mistakes.
README.md Outdated
- **Memory Backend Adapter (M5)** - Memory protocols, retrieval pipeline, org memory, and consolidation are complete; initial Mem0 adapter backend ([ADR-001](docs/decisions/ADR-001-memory-layer.md)) pending; research backends (GraphRAG, Temporal KG) planned
- **CLI Surface** - `cli/` package is placeholder-only
- **Security/Approval System (M7)** - SecOps agent with rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, risk classifier, and ToolInvoker integration are implemented; progressive trust (4 strategies), promotion/demotion, autonomy levels (5 tiers with presets, resolver, change strategies) and approval timeout policies (wait-forever, auto-deny, tiered, escalation-chain with task park/resume) are implemented; real authentication (JWT/OAuth) and approval workflow gates are planned
- **Security/Approval System (M7)** - SecOps agent with rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, and ToolInvoker integration are implemented; progressive trust (4 strategies), promotion/demotion, autonomy levels (5 tiers with presets, resolver, change strategies) and approval timeout policies (wait-forever, auto-deny, tiered, escalation-chain with task park/resume) are implemented; real authentication (JWT/OAuth) and approval workflow gates are planned
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bullet is under "Not implemented yet" but states the Security/Approval System features "are implemented". Either move it to the Implemented section (or reword it as planned/in-progress) to avoid contradictory project status documentation.

Suggested change
- **Security/Approval System (M7)** - SecOps agent with rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, and ToolInvoker integration are implemented; progressive trust (4 strategies), promotion/demotion, autonomy levels (5 tiers with presets, resolver, change strategies) and approval timeout policies (wait-forever, auto-deny, tiered, escalation-chain with task park/resume) are implemented; real authentication (JWT/OAuth) and approval workflow gates are planned
- **Security/Approval System (M7)** - SecOps agent with rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, and ToolInvoker integration are partially implemented and under active development; progressive trust (4 strategies), promotion/demotion, autonomy levels (5 tiers with presets, resolver, change strategies) and approval timeout policies (wait-forever, auto-deny, tiered, escalation-chain with task park/resume) are partially implemented and continuing to evolve; real authentication (JWT/OAuth) and approval workflow gates are planned

Copilot uses AI. Check for mistakes.
- Change default output_scan_policy_type from REDACT to AUTONOMY_TIERED
  so autonomy-aware policy routing is active by default (CodeRabbit)
- Replace deepcopy with shallow copy + MappingProxyType in
  AutonomyTieredPolicy to avoid breaking non-copyable policies (Copilot)
- Document stateless/immutable expectation on OutputScanResponsePolicy
  protocol docstring (Greptile)
- Add comment explaining intentional import-time integrity guard on
  _DEFAULT_AUTONOMY_POLICY_MAP (Greptile)
- Add unit test asserting default map covers all AutonomyLevel members
- Fix f-string in AutonomyTieredPolicy log to use structured kwargs (Gemini)
- Move LogOnlyPolicy import to top-level in test_service.py (CodeRabbit)
- Align _make_service_with_policy helper to attach _test_audit_log (CodeRabbit)
- Fix README contradictory status: move security items to Implemented (Copilot)
@Aureliolo Aureliolo merged commit b9907e8 into main Mar 10, 2026
9 of 10 checks passed
@Aureliolo Aureliolo deleted the feat/output-scanner-integration branch March 10, 2026 22:51
Comment on lines +280 to +281
except MemoryError, RecursionError:
raise
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python 3 SyntaxError: invalid multi-exception catch syntax

except MemoryError, RecursionError: is the old Python 2 binding syntax (except ExcType, varname:), which was explicitly removed in Python 3 (PEP 3110). In Python 3, this is a SyntaxError, meaning the entire service.py module will fail to import. Parentheses are required to catch multiple exception types:

Suggested change
except MemoryError, RecursionError:
raise
except (MemoryError, RecursionError):

Note: the same invalid pattern appears at lines 179, 263, 381, and 433 in this file — those are pre-existing occurrences with the same problem and should be fixed as well.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/security/service.py
Line: 280-281

Comment:
**Python 3 SyntaxError: invalid multi-exception catch syntax**

`except MemoryError, RecursionError:` is the old Python 2 binding syntax (`except ExcType, varname:`), which was explicitly removed in Python 3 (PEP 3110). In Python 3, this is a `SyntaxError`, meaning the entire `service.py` module will fail to import. Parentheses are required to catch multiple exception types:

```suggestion
        except (MemoryError, RecursionError):
```

Note: the same invalid pattern appears at lines 179, 263, 381, and 433 in this file — those are pre-existing occurrences with the same problem and should be fixed as well.

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: integrate OutputScanner into ToolInvoker execution flow

2 participants