Commit 548439c

refactor: harden output scan policies with factory, error handling, and docs
Pre-reviewed by 10 agents, 17 findings addressed:
- Add build_output_scan_policy() factory to wire config enum to runtime
- Wrap policy.apply() in try/except (fail-safe to raw scan result)
- Upgrade LogOnlyPolicy logging to WARNING when sensitive data found
- Add WARNING on AutonomyTieredPolicy fallback for unmapped levels
- Wrap _DEFAULT_AUTONOMY_POLICY_MAP in MappingProxyType (immutability)
- Add OutputScanPolicyType config tests, SUPERVISED level test, factory tests, policy-on-clean-result test, custom-map-fallback test
- Update DESIGN_SPEC.md §12.3 with policy docs and §15.3 structure
- Update CLAUDE.md package structure description
- Improve docstrings (WithholdPolicy findings, LogOnlyPolicy clarity, SecurityConfig attributes)
1 parent 7a610dc commit 548439c

11 files changed

Lines changed: 359 additions & 29 deletions

CLAUDE.md

Lines changed: 1 addition & 1 deletion
@@ -77,7 +77,7 @@ src/ai_company/
7777
persistence/ # Operational data persistence — pluggable PersistenceBackend protocol, SQLite initial (§7.6)
7878
observability/ # Structured logging, correlation tracking, log sinks
7979
providers/ # LLM provider abstraction (LiteLLM adapter)
80-
security/ # SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
80+
security/ # SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)
8181
templates/ # Pre-built company templates, personality presets, and builder
8282
tools/ # Tool registry, built-in tools (file_system/, git, sandbox/, code_runner), MCP bridge (mcp/), role-based access
8383
```

DESIGN_SPEC.md

Lines changed: 17 additions & 2 deletions
@@ -80,7 +80,7 @@ The MVP validates the core hypothesis: **a single agent can complete a real task
8080
> **How to read this spec:** Sections describe the full vision. Each section with deferred features includes an **MVP** callout box indicating what ships in M3 and what is deferred. The full design is documented upfront to inform architecture decisions — protocol interfaces are designed even for features that won't be built until later milestones.
8181
8282
> **Implementation snapshot (2026-03-10):**
83-
> - **Done:** M0–M6 (tooling, config/core, providers, single-agent engine, multi-agent orchestration, API/CLI surface) + Docker sandbox (#50), MCP bridge (#53), code runner + HR engine (hiring/firing/onboarding/offboarding/registry) + performance tracking (task metrics, quality scoring, collaboration scoring, trend detection, rolling windows). Memory layer backend selected ([ADR-001](docs/decisions/ADR-001-memory-layer.md)). Persistence backend (§7.6) completed. Memory retrieval pipeline (#41: ranking, token-budget formatting, context injection, non-inferable filtering) complete. Budget enforcement complete (BudgetEnforcer + configurable cost tiers + quota/subscription tracking). CFO cost optimization complete (CostOptimizer: anomaly detection, efficiency analysis, downgrade recommendations, routing optimization, approval decisions; ReportGenerator: multi-dimensional spending reports). Shared org memory (#125: HybridPromptRetrievalBackend, OrgFactStore, access control, factory) complete. Memory consolidation/archival (#48: ConsolidationService, SimpleConsolidationStrategy, RetentionEnforcer, ArchivalStore protocol) complete. SecOps agent (rule engine, audit log, output scanner, risk classifier, ToolInvoker integration), progressive trust (4 strategies: disabled/weighted/per-category/milestone behind TrustStrategy protocol), promotion/demotion (criteria evaluation, approval strategies, model mapping). Autonomy levels (#42: AutonomyLevel enum, presets, 3-level resolver, rule-based auto-downgrade/human-only promotion change strategy) + approval timeout policies (#126: 4 timeout policies, park/resume service, risk tier classifier, timeout checker) complete.
83+
> - **Done:** M0–M6 (tooling, config/core, providers, single-agent engine, multi-agent orchestration, API/CLI surface) + Docker sandbox (#50), MCP bridge (#53), code runner + HR engine (hiring/firing/onboarding/offboarding/registry) + performance tracking (task metrics, quality scoring, collaboration scoring, trend detection, rolling windows). Memory layer backend selected ([ADR-001](docs/decisions/ADR-001-memory-layer.md)). Persistence backend (§7.6) completed. Memory retrieval pipeline (#41: ranking, token-budget formatting, context injection, non-inferable filtering) complete. Budget enforcement complete (BudgetEnforcer + configurable cost tiers + quota/subscription tracking). CFO cost optimization complete (CostOptimizer: anomaly detection, efficiency analysis, downgrade recommendations, routing optimization, approval decisions; ReportGenerator: multi-dimensional spending reports). Shared org memory (#125: HybridPromptRetrievalBackend, OrgFactStore, access control, factory) complete. Memory consolidation/archival (#48: ConsolidationService, SimpleConsolidationStrategy, RetentionEnforcer, ArchivalStore protocol) complete. SecOps agent (rule engine, audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, ToolInvoker integration), progressive trust (4 strategies: disabled/weighted/per-category/milestone behind TrustStrategy protocol), promotion/demotion (criteria evaluation, approval strategies, model mapping). Autonomy levels (#42: AutonomyLevel enum, presets, 3-level resolver, rule-based auto-downgrade/human-only promotion change strategy) + approval timeout policies (#126: 4 timeout policies, park/resume service, risk tier classifier, timeout checker) complete.
8484
> - **Remaining:** JWT/OAuth auth, approval workflow gates.
8585
8686
### 1.5 Configuration Philosophy
@@ -2429,6 +2429,19 @@ A special meta-agent that reviews all actions before execution:
24292429
> - **D4 — LLM vs Rule-based:** Hybrid approach. Rule engine for known patterns (credentials, path traversal, destructive ops) — sub-ms, covers ~95% of cases. LLM fallback only for uncertain cases (~5%). Full autonomy mode: rules + audit logging only, no LLM path. Hard safety rules (credential exposure, data destruction) **never bypass** regardless of autonomy level. Precedent: AWS GuardDuty, LlamaFirewall, NeMo Guardrails all use hybrid.
24302430
> - **D5 — Integration Point:** Pluggable `SecurityInterceptionStrategy` protocol. Initial: before every tool invocation — slots into existing `ToolInvoker` between permission check and tool execution. Policy strictness (not interception point) configurable per autonomy level. Add post-tool-call scanning for sensitive data in outputs. Performance: sub-ms rule check is invisible against seconds of LLM inference. Future strategies: batch-level (before task step), assignment-only.
24312431

2432+
#### Output Scan Response Policies
2433+
2434+
After the output scanner detects sensitive data, a pluggable **`OutputScanResponsePolicy`** protocol decides how to handle the findings. Four built-in policies ship behind the protocol:
2435+
2436+
| Policy | Behavior | Default for |
2437+
|--------|----------|-------------|
2438+
| **Redact** (default) | Return scanner's redacted content as-is | `SEMI`, `SUPERVISED` autonomy |
2439+
| **Withhold** | Clear redacted content — fail-closed, no partial data returned | `LOCKED` autonomy |
2440+
| **Log-only** | Discard findings (logs at WARNING), pass original output through | `FULL` autonomy |
2441+
| **Autonomy-tiered** | Delegate to a sub-policy based on effective autonomy level | Composite policy |
2442+
2443+
Policy selection is declarative via `SecurityConfig.output_scan_policy_type` (`OutputScanPolicyType` enum). A factory function (`build_output_scan_policy`) resolves the enum to a concrete policy instance. Runtime constructor injection on `SecOpsService` is also supported for full flexibility. The policy is applied *after* audit recording, preserving audit fidelity regardless of policy outcome.
2444+
24322445
### 12.4 Approval Timeout Policy
24332446

24342447
When an action requires human approval (per autonomy level in §12.2), the agent must wait. The framework provides configurable timeout policies that determine what happens when a human doesn't respond. All policies implement a `TimeoutPolicy` protocol. The policy is configurable per autonomy level and per action risk tier.
@@ -3099,8 +3112,10 @@ ai-company/
30993112
│ │ ├── action_type_mapping.py # Default ToolCategory → ActionType mapping
31003113
│ │ ├── action_types.py # ActionTypeCategory registry and validation
31013114
│ │ ├── audit.py # Append-only AuditLog with configurable eviction
3102-
│ │ ├── config.py # SecurityConfig, SecurityPolicyRule, RuleEngineConfig
3115+
│ │ ├── config.py # SecurityConfig, SecurityPolicyRule, RuleEngineConfig, OutputScanPolicyType
31033116
│ │ ├── models.py # SecurityVerdict, SecurityContext, AuditEntry, OutputScanResult
3117+
│ │ ├── output_scan_policy.py # Output scan response policies (redact/withhold/log-only/autonomy-tiered)
3118+
│ │ ├── output_scan_policy_factory.py # build_output_scan_policy() factory
31043119
│ │ ├── output_scanner.py # Post-tool output scanning (regex-based redaction)
31053120
│ │ ├── protocol.py # SecurityInterceptionStrategy protocol
31063121
│ │ ├── service.py # SecOpsService — meta-agent coordinating security

src/ai_company/security/__init__.py

Lines changed: 4 additions & 0 deletions
@@ -38,6 +38,9 @@
3838
RedactPolicy,
3939
WithholdPolicy,
4040
)
41+
from ai_company.security.output_scan_policy_factory import (
42+
build_output_scan_policy,
43+
)
4144
from ai_company.security.output_scanner import OutputScanner
4245
from ai_company.security.protocol import SecurityInterceptionStrategy
4346
from ai_company.security.rules.engine import RuleEngine
@@ -67,4 +70,5 @@
6770
"SecurityVerdict",
6871
"SecurityVerdictType",
6972
"WithholdPolicy",
73+
"build_output_scan_policy",
7074
]

src/ai_company/security/config.py

Lines changed: 2 additions & 0 deletions
@@ -98,6 +98,8 @@ class SecurityConfig(BaseModel):
9898
post_tool_scanning_enabled: Scan tool output for secrets.
9999
hard_deny_action_types: Action types always denied.
100100
auto_approve_action_types: Action types always approved.
101+
output_scan_policy_type: Output scan response policy
102+
(default: ``REDACT``).
101103
custom_policies: User-defined policy rules.
102104
"""
103105

src/ai_company/security/output_scan_policy.py

Lines changed: 61 additions & 23 deletions
@@ -6,6 +6,7 @@
66
autonomy level.
77
"""
88

9+
from types import MappingProxyType
910
from typing import TYPE_CHECKING, Protocol, runtime_checkable
1011

1112
from ai_company.core.enums import AutonomyLevel
@@ -91,6 +92,9 @@ class WithholdPolicy:
9192
"""Clear redacted content when sensitive data is found.
9293
9394
Forces fail-closed in the invoker — no partial data is returned.
95+
The ``findings`` tuple is deliberately preserved so that audit
96+
consumers can categorise what was detected without seeing the
97+
actual content.
9498
"""
9599

96100
@property
@@ -123,10 +127,13 @@ def apply(
123127

124128

125129
class LogOnlyPolicy:
126-
"""Return an empty result — findings are logged but output passes through.
130+
"""Discard scan findings, returning a clean result.
127131
132+
The caller should treat the original tool output as unmodified.
128133
Suitable for audit-only mode or high-trust agents where output
129-
scanning is informational rather than enforced.
134+
scanning is informational rather than enforced. The audit entry
135+
written by ``SecOpsService.scan_output`` before this policy runs
136+
preserves the original findings.
130137
"""
131138

132139
@property
@@ -137,32 +144,50 @@ def name(self) -> str:
137144
def apply(
138145
self,
139146
scan_result: OutputScanResult,
140-
context: SecurityContext, # noqa: ARG002
147+
context: SecurityContext,
141148
) -> OutputScanResult:
142-
"""Return empty result regardless of findings.
149+
"""Return a clean ``OutputScanResult`` regardless of findings.
150+
151+
Suppresses enforcement while preserving the audit log entry
152+
written by ``SecOpsService.scan_output``.
143153
144154
Args:
145155
scan_result: Result from the output scanner.
146-
context: Security context (unused).
156+
context: Security context of the tool invocation.
147157
148158
Returns:
149-
Empty ``OutputScanResult``.
159+
Clean ``OutputScanResult`` with ``has_sensitive_data=False``.
150160
"""
151-
logger.debug(
152-
SECURITY_OUTPUT_SCAN_POLICY_APPLIED,
153-
policy="log_only",
154-
has_sensitive_data=scan_result.has_sensitive_data,
155-
)
161+
if scan_result.has_sensitive_data:
162+
logger.warning(
163+
SECURITY_OUTPUT_SCAN_POLICY_APPLIED,
164+
policy="log_only",
165+
has_sensitive_data=True,
166+
findings=scan_result.findings,
167+
tool_name=context.tool_name,
168+
agent_id=context.agent_id,
169+
note="Sensitive data detected but passed through by log_only policy",
170+
)
171+
else:
172+
logger.debug(
173+
SECURITY_OUTPUT_SCAN_POLICY_APPLIED,
174+
policy="log_only",
175+
has_sensitive_data=False,
176+
)
156177
return OutputScanResult()
157178

158179

159-
# Default autonomy-to-policy mapping.
160-
_DEFAULT_AUTONOMY_POLICY_MAP: dict[AutonomyLevel, OutputScanResponsePolicy] = {
161-
AutonomyLevel.FULL: LogOnlyPolicy(),
162-
AutonomyLevel.SEMI: RedactPolicy(),
163-
AutonomyLevel.SUPERVISED: RedactPolicy(),
164-
AutonomyLevel.LOCKED: WithholdPolicy(),
165-
}
180+
# Default autonomy-to-policy mapping (read-only).
181+
_DEFAULT_AUTONOMY_POLICY_MAP: Mapping[AutonomyLevel, OutputScanResponsePolicy] = (
182+
MappingProxyType(
183+
{
184+
AutonomyLevel.FULL: LogOnlyPolicy(),
185+
AutonomyLevel.SEMI: RedactPolicy(),
186+
AutonomyLevel.SUPERVISED: RedactPolicy(),
187+
AutonomyLevel.LOCKED: WithholdPolicy(),
188+
}
189+
)
190+
)
166191

167192

168193
class AutonomyTieredPolicy:
@@ -212,18 +237,31 @@ def apply(
212237
"""
213238
if self._effective_autonomy is None:
214239
delegate = self._fallback
240+
autonomy_level = None
215241
else:
216-
level = self._effective_autonomy.level
217-
delegate = self._policy_map.get(level, self._fallback)
242+
autonomy_level = self._effective_autonomy.level
243+
mapped = self._policy_map.get(autonomy_level)
244+
if mapped is not None:
245+
delegate = mapped
246+
else:
247+
delegate = self._fallback
248+
logger.warning(
249+
SECURITY_OUTPUT_SCAN_POLICY_APPLIED,
250+
policy="autonomy_tiered",
251+
autonomy_level=autonomy_level.value,
252+
note=(
253+
f"No policy mapped for autonomy level "
254+
f"'{autonomy_level.value}' — falling back to "
255+
f"'{self._fallback.name}'"
256+
),
257+
)
218258

219259
logger.debug(
220260
SECURITY_OUTPUT_SCAN_POLICY_APPLIED,
221261
policy="autonomy_tiered",
222262
delegate=delegate.name,
223263
autonomy_level=(
224-
self._effective_autonomy.level.value
225-
if self._effective_autonomy is not None
226-
else None
264+
autonomy_level.value if autonomy_level is not None else None
227265
),
228266
)
229267
return delegate.apply(scan_result, context)

src/ai_company/security/output_scan_policy_factory.py

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
1+
"""Factory for creating output scan policy instances from configuration."""
2+
3+
from typing import TYPE_CHECKING
4+
5+
from ai_company.observability import get_logger
6+
from ai_company.observability.events.security import (
7+
SECURITY_OUTPUT_SCAN_POLICY_APPLIED,
8+
)
9+
from ai_company.security.config import OutputScanPolicyType
10+
from ai_company.security.output_scan_policy import (
11+
AutonomyTieredPolicy,
12+
LogOnlyPolicy,
13+
OutputScanResponsePolicy,
14+
RedactPolicy,
15+
WithholdPolicy,
16+
)
17+
18+
if TYPE_CHECKING:
19+
from ai_company.security.autonomy.models import EffectiveAutonomy
20+
21+
logger = get_logger(__name__)
22+
23+
24+
def build_output_scan_policy(
25+
policy_type: OutputScanPolicyType,
26+
*,
27+
effective_autonomy: EffectiveAutonomy | None = None,
28+
) -> OutputScanResponsePolicy:
29+
"""Create an output scan policy from its config enum value.
30+
31+
Args:
32+
policy_type: Declarative policy selection from config.
33+
effective_autonomy: Resolved autonomy for the current run.
34+
Required when ``policy_type`` is ``AUTONOMY_TIERED``;
35+
ignored otherwise.
36+
37+
Returns:
38+
A configured output scan response policy instance.
39+
40+
Raises:
41+
TypeError: If ``policy_type`` is not a recognized enum member.
42+
"""
43+
match policy_type:
44+
case OutputScanPolicyType.REDACT:
45+
return RedactPolicy()
46+
case OutputScanPolicyType.WITHHOLD:
47+
return WithholdPolicy()
48+
case OutputScanPolicyType.LOG_ONLY:
49+
return LogOnlyPolicy()
50+
case OutputScanPolicyType.AUTONOMY_TIERED:
51+
if effective_autonomy is None:
52+
logger.warning(
53+
SECURITY_OUTPUT_SCAN_POLICY_APPLIED,
54+
policy_type=policy_type.value,
55+
note="output_scan_policy_type=autonomy_tiered "
56+
"but no effective_autonomy — "
57+
"AutonomyTieredPolicy will fall back to "
58+
"RedactPolicy",
59+
)
60+
return AutonomyTieredPolicy(
61+
effective_autonomy=effective_autonomy,
62+
)
63+
64+
msg = f"Unknown output scan policy type: {policy_type!r}" # type: ignore[unreachable]
65+
logger.warning(
66+
SECURITY_OUTPUT_SCAN_POLICY_APPLIED,
67+
policy_type=str(policy_type),
68+
note="Unknown output scan policy type",
69+
)
70+
raise TypeError(msg)

src/ai_company/security/service.py

Lines changed: 12 additions & 1 deletion
@@ -254,7 +254,18 @@ async def scan_output(
254254
)
255255

256256
if self._output_scan_policy is not None:
257-
result = self._output_scan_policy.apply(result, context)
257+
try:
258+
result = self._output_scan_policy.apply(result, context)
259+
except MemoryError, RecursionError:
260+
raise
261+
except Exception:
262+
logger.exception(
263+
SECURITY_INTERCEPTOR_ERROR,
264+
tool_name=context.tool_name,
265+
policy=self._output_scan_policy.name,
266+
note="Output scan policy application failed "
267+
"— returning raw scan result",
268+
)
258269

259270
return result
260271
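
The fail-safe wrapper added to `scan_output` can be sketched as a standalone function. `apply_policy_fail_safe` is a hypothetical helper name, and the standard-library `logging` module stands in for the project's structured logger. The parenthesized `except (MemoryError, RecursionError)` form is used here because it parses on every Python 3 version, whereas the unparenthesized form in the diff needs Python 3.14 or later.

```python
import logging

logger = logging.getLogger("secops_sketch")


def apply_policy_fail_safe(policy, result, context):
    """Apply a policy, falling back to the raw scan result on failure."""
    try:
        return policy.apply(result, context)
    except (MemoryError, RecursionError):
        # Resource-exhaustion errors are never swallowed.
        raise
    except Exception:
        logger.exception(
            "Output scan policy %r failed; returning raw scan result",
            getattr(policy, "name", "<unknown>"),
        )
        return result
```

The raw result still carries the scanner's findings, so a failing policy degrades to the unmodified scan outcome rather than dropping the scan altogether.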

tests/unit/security/test_config.py

Lines changed: 33 additions & 0 deletions
@@ -5,6 +5,7 @@
55

66
from ai_company.core.enums import ApprovalRiskLevel
77
from ai_company.security.config import (
8+
OutputScanPolicyType,
89
RuleEngineConfig,
910
SecurityConfig,
1011
SecurityPolicyRule,
@@ -169,6 +170,7 @@ def test_defaults(self) -> None:
169170
"code:read",
170171
"docs:write",
171172
)
173+
assert cfg.output_scan_policy_type == OutputScanPolicyType.REDACT
172174
assert cfg.custom_policies == ()
173175

174176
def test_disabled_state(self) -> None:
@@ -270,8 +272,39 @@ def test_json_roundtrip(self) -> None:
270272
post_tool_scanning_enabled=False,
271273
hard_deny_action_types=("org:fire",),
272274
auto_approve_action_types=(),
275+
output_scan_policy_type=OutputScanPolicyType.WITHHOLD,
273276
custom_policies=(policy,),
274277
)
275278
json_str = cfg.model_dump_json()
276279
restored = SecurityConfig.model_validate_json(json_str)
277280
assert restored == cfg
281+
assert restored.output_scan_policy_type == OutputScanPolicyType.WITHHOLD
282+
283+
284+
# ── OutputScanPolicyType ─────────────────────────────────────────
285+
286+
287+
@pytest.mark.unit
288+
class TestOutputScanPolicyType:
289+
"""Tests for OutputScanPolicyType enum values and config integration."""
290+
291+
@pytest.mark.parametrize(
292+
"policy_type",
293+
list(OutputScanPolicyType),
294+
)
295+
def test_all_policy_types_accepted_in_config(
296+
self,
297+
policy_type: OutputScanPolicyType,
298+
) -> None:
299+
cfg = SecurityConfig(output_scan_policy_type=policy_type)
300+
assert cfg.output_scan_policy_type == policy_type
301+
302+
def test_invalid_policy_type_rejected(self) -> None:
303+
with pytest.raises(ValidationError):
304+
SecurityConfig(output_scan_policy_type="nonexistent") # type: ignore[arg-type]
305+
306+
def test_enum_values(self) -> None:
307+
assert OutputScanPolicyType.REDACT.value == "redact"
308+
assert OutputScanPolicyType.WITHHOLD.value == "withhold"
309+
assert OutputScanPolicyType.LOG_ONLY.value == "log_only"
310+
assert OutputScanPolicyType.AUTONOMY_TIERED.value == "autonomy_tiered"
