feat: add autonomy levels and approval timeout policies (#42, #126) by Aureliolo · Pull Request #197 · Aureliolo/synthorg

Aureliolo · 2026-03-10T11:59:54Z

Summary

Autonomy levels: Three-level resolution chain (agent → department → company default) with AutonomyResolver, EffectiveAutonomy model, action type expansion via ActionTypeRegistry, seniority constraints (JUNIOR can't have FULL), and HumanOnlyPromotionStrategy for auto-downgrade
Approval timeout policies: Four pluggable TimeoutPolicy implementations (Wait Forever, Deny on Timeout, Tiered per risk level, Escalation Chain) with discriminated union config, DefaultRiskTierClassifier, and TimeoutChecker service
Task park/resume: ParkService serializes AgentContext to ParkedContext for persistence when awaiting approval; PARKED termination reason in ExecutionResult
Persistence: SQLite v3 migration adds parked_contexts table with indexes; SQLiteParkedContextRepository with full CRUD
API: AutonomyController (GET/POST /agents/{agent_id}/autonomy) registered with app router
Security integration: Autonomy pre-check in SecOpsService, security guard blocking auto-approve for HIGH/CRITICAL risk tiers

Pre-PR review fixes (51 findings from 10 agents)

Fix autonomy controller returning requested level instead of current
Add disjoint validator on EffectiveAutonomy action sets
Add escalate_to consistency validator on TimeoutAction
Add MemoryError/RecursionError re-raise in security service broad except blocks
Fix _row_to_model to raise QueryError instead of returning None
Rename YamlRiskTierClassifier → DefaultRiskTierClassifier
Move Jinja2 env to module-level singleton in template renderer
Fix personality mutation pattern (return instead of mutate)
Fix immutability violations (deepcopy metadata, immutable dict patterns)
Enumerate columns explicitly in SELECT queries (no SELECT *)
Register AutonomyController with app router

Closes #42
Closes #126

Test plan

6396 tests pass (0 failures), 95% coverage
New tests: test_parked_context_repo.py (12 tests), test_autonomy.py (4 tests)
Edge case tests: negative timeout, zero timeout, disjoint overlap, escalate_to validator, double-downgrade, v3 migration
Event constants verified in test_events.py (autonomy, timeout, parked context persistence)
Rename propagated: DefaultRiskTierClassifier in all test files
ruff, mypy strict, pre-commit hooks all pass
Rebased on latest main, conflict resolved in prompt.py

github-actions · 2026-03-10T12:00:05Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

coderabbitai · 2026-03-10T12:00:15Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a46085ec-9651-4a88-a35a-f6acdd55701c

📥 Commits

Reviewing files that changed from the base of the PR and between 1855156 and 4deae2a.

📒 Files selected for processing (63)

CLAUDE.md
DESIGN_SPEC.md
README.md
src/ai_company/api/controllers/__init__.py
src/ai_company/api/controllers/autonomy.py
src/ai_company/config/schema.py
src/ai_company/core/agent.py
src/ai_company/core/company.py
src/ai_company/core/enums.py
src/ai_company/engine/agent_engine.py
src/ai_company/engine/loop_protocol.py
src/ai_company/engine/prompt.py
src/ai_company/engine/prompt_template.py
src/ai_company/observability/events/autonomy.py
src/ai_company/observability/events/persistence.py
src/ai_company/observability/events/timeout.py
src/ai_company/persistence/protocol.py
src/ai_company/persistence/repositories.py
src/ai_company/persistence/sqlite/backend.py
src/ai_company/persistence/sqlite/migrations.py
src/ai_company/persistence/sqlite/parked_context_repo.py
src/ai_company/security/autonomy/__init__.py
src/ai_company/security/autonomy/change_strategy.py
src/ai_company/security/autonomy/models.py
src/ai_company/security/autonomy/protocol.py
src/ai_company/security/autonomy/resolver.py
src/ai_company/security/service.py
src/ai_company/security/timeout/__init__.py
src/ai_company/security/timeout/config.py
src/ai_company/security/timeout/factory.py
src/ai_company/security/timeout/models.py
src/ai_company/security/timeout/park_service.py
src/ai_company/security/timeout/parked_context.py
src/ai_company/security/timeout/policies.py
src/ai_company/security/timeout/protocol.py
src/ai_company/security/timeout/risk_tier_classifier.py
src/ai_company/security/timeout/timeout_checker.py
src/ai_company/templates/renderer.py
tests/unit/api/conftest.py
tests/unit/api/controllers/test_autonomy.py
tests/unit/core/conftest.py
tests/unit/core/test_company.py
tests/unit/engine/test_loop_protocol.py
tests/unit/engine/test_prompt.py
tests/unit/memory/org/test_prompt_integration.py
tests/unit/observability/test_events.py
tests/unit/persistence/sqlite/test_migrations.py
tests/unit/persistence/sqlite/test_parked_context_repo.py
tests/unit/persistence/test_migrations_v2.py
tests/unit/persistence/test_protocol.py
tests/unit/security/autonomy/__init__.py
tests/unit/security/autonomy/test_change_strategy.py
tests/unit/security/autonomy/test_models.py
tests/unit/security/autonomy/test_resolver.py
tests/unit/security/test_service.py
tests/unit/security/timeout/__init__.py
tests/unit/security/timeout/test_config.py
tests/unit/security/timeout/test_factory.py
tests/unit/security/timeout/test_park_service.py
tests/unit/security/timeout/test_parked_context.py
tests/unit/security/timeout/test_policies.py
tests/unit/security/timeout/test_risk_tier_classifier.py
tests/unit/security/timeout/test_timeout_checker.py

📝 Walkthrough

Summary by CodeRabbit

New Features
- Agent autonomy subsystem with configurable levels and presets
- API endpoints to view/update per-agent autonomy
- Approval timeout policies (wait, deny, tiered, escalation) and park/resume of agent execution contexts
Configuration Changes
- Autonomy moved to preset-based config and per-agent/department overrides
- New approval timeout configuration options
Observability
- New autonomy and timeout event types for monitoring lifecycle and decisions

Walkthrough

Adds a complete autonomy + approval-timeout subsystem: enums, models, resolver, change strategy, timeout policies, park/resume, persistence (ParkedContext + repo + migration), SecOps/autonomy wiring, prompt/engine threading, API controller, observability events, and comprehensive tests.

Changes

Cohort / File(s)	Summary
Core enums & configs `src/ai_company/core/enums.py`, `src/ai_company/config/schema.py`, `src/ai_company/core/company.py`, `src/ai_company/core/agent.py`	Add `AutonomyLevel`, downgrade/timeout enums; company/agent config gains autonomy fields and approval_timeout; float→AutonomyConfig coercion; agent-level validation (seniority vs FULL).
Autonomy models & strategies `src/ai_company/security/autonomy/models.py`, `.../protocol.py`, `.../change_strategy.py`, `.../resolver.py`, `.../__init__.py`	New AutonomyPreset/AutonomyConfig/EffectiveAutonomy/AutonomyOverride; `AutonomyChangeStrategy` protocol; `HumanOnlyPromotionStrategy` implementation; `AutonomyResolver` for agent→dept→company resolution and action expansion.
Timeout policy surface `src/ai_company/security/timeout/config.py`, `.../models.py`, `.../protocol.py`, `.../factory.py`	Introduce discriminated ApprovalTimeoutConfig (wait/deny/tiered/escalation), `TimeoutAction` model, `TimeoutPolicy`/`RiskTierClassifier` protocols, and factory to instantiate policies.
Timeout policies & classifier `src/ai_company/security/timeout/policies.py`, `.../risk_tier_classifier.py`, `.../timeout_checker.py`, `.../__init__.py`	Implement four policies (WaitForever, DenyOnTimeout, Tiered, EscalationChain), default risk-tier classifier, and `TimeoutChecker` to evaluate/resolve approval items.
Park/resume persistence `src/ai_company/security/timeout/parked_context.py`, `.../park_service.py`	Add `ParkedContext` model (frozen, JSON context), and `ParkService` to park/resume AgentContext with logging/validation.
Persistence repo & SQLite impl `src/ai_company/persistence/repositories.py`, `.../protocol.py`, `.../sqlite/parked_context_repo.py`, `.../sqlite/backend.py`, `.../sqlite/migrations.py`	Add `ParkedContextRepository` protocol; SQLite implementation `SQLiteParkedContextRepository`; backend property `parked_contexts`; DB migration v3 creating parked_contexts table and indexes.
Security integration `src/ai_company/security/service.py`	Thread `EffectiveAutonomy` and `RiskTierClassifier` into `SecOpsService`; apply autonomy augmentation after rule evaluation to handle auto-approve / human-required actions and escalation logic.
Engine & prompt wiring `src/ai_company/engine/agent_engine.py`, `src/ai_company/engine/prompt.py`, `src/ai_company/engine/prompt_template.py`, `src/ai_company/engine/loop_protocol.py`	Thread `effective_autonomy` through AgentEngine, prompt builders, and trimming; bump prompt template version and render autonomy block; add PARKED termination reason and validation.
API & templates `src/ai_company/api/controllers/autonomy.py`, `src/ai_company/api/controllers/__init__.py`, `src/ai_company/templates/renderer.py`	New `AutonomyController` (GET/POST) and export; renderer refactor (module-level Jinja env), personality resolution change, and numeric-config backward compatibility.
Observability events `src/ai_company/observability/events/autonomy.py`, `.../persistence.py`, `.../timeout.py`	Add AUTONOMY_* and TIMEOUT_* event constants and parked-context persistence events.
Tests & test fixtures `tests/unit/**` (many files, see diff)	Extensive unit tests added/updated: autonomy models/resolver/change strategy, timeout configs/policies/checker, ParkService and parked-context repo, controller tests, prompt rendering, migrations, and test fixtures exposing fake parked_context repos.
Docs `DESIGN_SPEC.md`, `CLAUDE.md`, `README.md`	Design and docs updated to describe autonomy, timeout policies, park/resume, and observability guidance.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant AutonomyResolver
    participant AutonomyConfig
    participant ActionRegistry

    Client->>AutonomyResolver: resolve(agent_level, dept_level, seniority)
    AutonomyResolver->>AutonomyConfig: select preset (agent>dept>company)
    AutonomyConfig->>AutonomyConfig: fetch preset for level
    AutonomyResolver->>ActionRegistry: expand patterns -> concrete actions
    ActionRegistry-->>AutonomyResolver: return action set
    AutonomyResolver-->>Client: EffectiveAutonomy(level, auto_approve, human_approval, security_agent)

sequenceDiagram
    participant AgentEngine
    participant SecOpsService
    participant TimeoutChecker
    participant TimeoutPolicy
    participant ParkService
    participant Persistence

    AgentEngine->>SecOpsService: evaluate action with EffectiveAutonomy
    SecOpsService->>SecOpsService: is action auto_approved?
    alt Auto-approved
        SecOpsService-->>AgentEngine: ALLOW (proceed)
    else Human approval required
        SecOpsService->>TimeoutChecker: check_and_resolve(approval_item)
        TimeoutChecker->>TimeoutPolicy: determine_action(item, elapsed)
        TimeoutPolicy-->>TimeoutChecker: WAIT / APPROVE / DENY / ESCALATE
        alt WAIT
            SecOpsService->>ParkService: park(context)
            ParkService->>Persistence: save(ParkedContext)
            Persistence-->>ParkService: parked_id
            ParkService-->>SecOpsService: parked_context
            SecOpsService-->>AgentEngine: Task parked (agent continues others)
        else APPROVE/DENY/ESCALATE
            SecOpsService-->>AgentEngine: return verdict (proceed/escalate/deny)
        end
    end

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

feat: add autonomy levels and approval timeout policies (#42, #126) #197 — Very similar large-surface PR that introduces autonomy models/resolver/strategy, timeout/parked-context APIs, persistence + SQLite parked_context repo, enums, events, and threading EffectiveAutonomy through engine/prompt/SecOps (highly related).
feat: add pluggable PersistenceBackend protocol with SQLite implementation #179 — Related persistence changes; introduces or extends persistence backend infrastructure that this PR extends with parked context repository, migrations, and persistence event constants.
feat: add coordination error taxonomy classification pipeline (#146) #181 — AgentEngine and execution-path changes; this PR threads effective_autonomy through the same engine functions and helpers.

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feat/autonomy-levels-and-approval-timeout

✨ Simplify code

Create PR with simplified code
Commit simplified code in branch feat/autonomy-levels-and-approval-timeout

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

greptile-apps · 2026-03-10T12:03:53Z

Greptile Summary

This PR implements the autonomy levels (#42) and approval timeout policies (#126) features — a substantial addition covering AutonomyResolver, four TimeoutPolicy implementations, ParkService/ParkedContext park-resume flow, a SQLite v3 migration, and the AutonomyController API endpoint. The overall architecture is well-structured: autonomy augmentation is applied strictly after the rule engine (so security detectors can never be bypassed), the discriminated-union timeout config is clean, and the persistence layer follows existing repository patterns correctly.

Two blocking issues require fixes before merging:

Python 3 SyntaxError in service.py and timeout_checker.py: except MemoryError, RecursionError: is Python 2 syntax. All five occurrences (lines 158, 239, 338, 390 in service.py and line 65 in timeout_checker.py) will cause a SyntaxError at module import time, making the entire security subsystem and timeout checker non-functional. The fix is except (MemoryError, RecursionError): throughout.
EscalationChainPolicy off-by-one role selection: When step N's timeout expires, the policy escalates to chain[N+1].role instead of chain[N].role. chain[0].role is dead code — never used as an escalation target. A single-step chain will never escalate to any role; it jumps straight to on_chain_exhausted. The escalation target should be the role of the step whose timeout just expired, not the next step.

Additional non-blocking finding:

The POST /agents/{agent_id}/autonomy handler logs AUTONOMY_PROMOTION_DENIED for all requests, including demotions, which will produce misleading audit entries.

Confidence Score: 1/5

Not safe to merge — two blocking bugs: a Python 3 SyntaxError that disables the security subsystem at import time, and an off-by-one in the escalation chain that routes escalations to the wrong role.
The Python 2 except A, B: syntax in service.py and timeout_checker.py is a SyntaxError in Python 3 and will prevent the security module from loading at all. The EscalationChainPolicy incorrectly targets chain[N+1].role when chain[N]'s timeout expires, meaning every configured escalation role is wrong by one position and single-step chains never escalate. Both issues must be fixed before the PR is safe to deploy. The underlying architecture and the rest of the implementation are solid.
src/ai_company/security/service.py (SyntaxError, lines 158/239/338/390), src/ai_company/security/timeout/timeout_checker.py (SyntaxError, line 65), src/ai_company/security/timeout/policies.py (EscalationChainPolicy off-by-one role selection).

Important Files Changed

Filename	Overview
src/ai_company/security/service.py	Five occurrences of invalid Python 3 `except MemoryError, RecursionError:` syntax will cause a SyntaxError at import time, making the entire security subsystem non-functional.
src/ai_company/security/timeout/policies.py	EscalationChainPolicy has an off-by-one role selection bug: when step[N] times out, the policy escalates to step[N+1].role instead of step[N].role; step[0].role is dead code and single-step chains never escalate to any role.
src/ai_company/security/timeout/timeout_checker.py	Contains one Python 2-style `except MemoryError, RecursionError:` (SyntaxError in Python 3); the rest of the service is well-structured with correct fail-safe fallback to WAIT.
src/ai_company/api/controllers/autonomy.py	GET handler reads company-level default instead of resolving per-agent/department (flagged in prior thread); POST endpoint always emits AUTONOMY_PROMOTION_DENIED even for demotions. Core wiring and guards are otherwise correct.
src/ai_company/security/autonomy/resolver.py	Three-level resolution chain (agent → department → company) and category expansion logic are clean and correct; seniority constraint and logging are properly implemented.
src/ai_company/security/autonomy/models.py	AutonomyPreset, EffectiveAutonomy, and AutonomyOverride models are well-validated (disjoint sets, downgrade direction), frozen, and include sensible built-in presets.
src/ai_company/security/timeout/config.py	Discriminated union of four timeout configs is well-designed; TierConfig rejects ESCALATE (no target), EscalationChainConfig rejects empty chains and ESCALATE as on_chain_exhausted — all validated correctly.
src/ai_company/persistence/sqlite/parked_context_repo.py	Clean SQLite CRUD implementation: explicit column SELECTs, raises QueryError instead of returning None on row failure, and properly double-encodes the metadata JSON string for storage.
src/ai_company/security/autonomy/change_strategy.py	HumanOnlyPromotionStrategy correctly prevents autonomy escalation via in-memory overrides with exhaustiveness check on _DOWNGRADE_MAP; in-memory storage is acknowledged as a known limitation in the docstring.
src/ai_company/security/timeout/park_service.py	ParkService correctly uses model_dump_json/model_validate_json for round-trip serialization; defensive ID-consistency checks after construction are a good addition.
src/ai_company/persistence/sqlite/migrations.py	V3 migration adding parked_contexts table is correct; includes indexes on agent_id and approval_id; schema version bump and migration registration are clean.
src/ai_company/security/timeout/risk_tier_classifier.py	DefaultRiskTierClassifier correctly defaults unknown action types to HIGH (fail-safe per D19); exhaustiveness check at module load time catches newly added ActionType members.
src/ai_company/core/company.py	Backward-compatible float→AutonomyConfig coercion via before-validator is clean; _float_to_autonomy_level thresholds and WaitForeverConfig default are sensible.

Sequence Diagram

sequenceDiagram
    participant Engine as AgentEngine
    participant SecOps as SecOpsService
    participant Rules as RuleEngine
    participant Autonomy as AutonomyAugmentation
    participant Store as ApprovalStore
    participant Park as ParkService
    participant Repo as ParkedContextRepository
    participant Timeout as TimeoutChecker

    Engine->>SecOps: intercept(context, effective_autonomy)
    SecOps->>Rules: evaluate(context)
    Rules-->>SecOps: SecurityVerdict (ALLOW/DENY/ESCALATE)
    SecOps->>Autonomy: _apply_autonomy_augmentation(context, verdict)
    Note over Autonomy: DENY/ESCALATE from rules always wins<br/>ALLOW may be upgraded to ESCALATE<br/>based on human_approval_actions
    Autonomy-->>SecOps: verdict (possibly upgraded)
    alt verdict == ESCALATE
        SecOps->>Store: add(ApprovalItem)
        SecOps-->>Engine: verdict with approval_id
        Engine->>Park: park(context, approval_id)
        Park-->>Engine: ParkedContext
        Engine->>Repo: save(ParkedContext)
        Engine-->>Engine: ExecutionResult(PARKED)
    else verdict == ALLOW
        SecOps-->>Engine: allow
    else verdict == DENY
        SecOps-->>Engine: deny
    end

    Note over Timeout: Background / periodic poll
    Timeout->>Store: get pending items
    Timeout->>Timeout: check(item, elapsed_seconds)
    alt APPROVE / DENY
        Timeout->>Store: update item status
    else ESCALATE
        Timeout->>Store: re-route to next role
    end

Comments Outside Diff (2)

src/ai_company/security/service.py, line 158-159 (link)

Invalid Python 3 except syntax — SyntaxError at import time

except MemoryError, RecursionError: is Python 2 syntax. In Python 3 this is a SyntaxError, not valid tuple-catching. The entire service.py module will fail to import, making the whole security subsystem non-functional regardless of what the tests report.

The correct Python 3 syntax is to wrap both exception types in parentheses. All four occurrences in this file (lines 158, 239, 338, 390) and the fifth in src/ai_company/security/timeout/timeout_checker.py:65 share the same problem.

Prompt To Fix With AI

This is a comment left during a code review.
Path: src/ai_company/security/service.py
Line: 158-159

Comment:
**Invalid Python 3 `except` syntax — `SyntaxError` at import time**

`except MemoryError, RecursionError:` is Python 2 syntax. In Python 3 this is a `SyntaxError`, not valid tuple-catching. The entire `service.py` module will fail to import, making the whole security subsystem non-functional regardless of what the tests report.

The correct Python 3 syntax is to wrap both exception types in parentheses. All four occurrences in this file (lines 158, 239, 338, 390) and the fifth in `src/ai_company/security/timeout/timeout_checker.py:65` share the same problem.



How can I resolve this? If you propose a fix, please make it concise.

src/ai_company/api/controllers/autonomy.py, line 284-291 (link)

AUTONOMY_PROMOTION_DENIED logged for all change requests, including demotions

AUTONOMY_PROMOTION_DENIED is emitted unconditionally regardless of whether the requested level is higher or lower than the current level. Requesting LOCKED (a demotion) or SUPERVISED from FULL would still log "Autonomy level changes require human approval" under the AUTONOMY_PROMOTION_DENIED event, making audit trails confusing.

Consider branching on compare_autonomy(requested_level, current_level) and emitting a dedicated event (e.g., AUTONOMY_DEMOTION_PENDING) for downgrade requests vs. promotions.

Prompt To Fix With AI

This is a comment left during a code review.
Path: src/ai_company/api/controllers/autonomy.py
Line: 284-291

Comment:
**`AUTONOMY_PROMOTION_DENIED` logged for all change requests, including demotions**

`AUTONOMY_PROMOTION_DENIED` is emitted unconditionally regardless of whether the requested level is higher or lower than the current level. Requesting `LOCKED` (a demotion) or `SUPERVISED` from `FULL` would still log "Autonomy level changes require human approval" under the `AUTONOMY_PROMOTION_DENIED` event, making audit trails confusing.

Consider branching on `compare_autonomy(requested_level, current_level)` and emitting a dedicated event (e.g., `AUTONOMY_DEMOTION_PENDING`) for downgrade requests vs. promotions.

How can I resolve this? If you propose a fix, please make it concise.

_{Last reviewed commit: 4deae2a}

src/ai_company/api/controllers/autonomy.py

src/ai_company/security/service.py

src/ai_company/security/autonomy/change_strategy.py

Copilot

Pull request overview

Implements autonomy-level based approval routing and configurable approval-timeout handling, including task park/resume persistence, and integrates these concepts into the engine prompt/security flow.

Changes:

Add autonomy subsystem (models, resolver, change strategy) and wire “effective autonomy” into SecOps pre-tool checks and system prompt rendering.
Add approval timeout subsystem (policies, classifier, checker, config/factory) plus parked-context model/service and SQLite persistence + migration v3.
Add API endpoint for reading/requesting autonomy changes, new observability event constants, and broad unit test coverage for the new subsystems.

Reviewed changes

Copilot reviewed 57 out of 59 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
tests/unit/security/timeout/test_timeout_checker.py	Unit tests for TimeoutChecker behavior and resolution updates.
tests/unit/security/timeout/test_risk_tier_classifier.py	Tests default/custom mappings and unknown-action fallback for risk tiers.
tests/unit/security/timeout/test_policies.py	Tests wait/deny/tiered/escalation timeout policies.
tests/unit/security/timeout/test_parked_context.py	Tests ParkedContext validation and immutability.
tests/unit/security/timeout/test_park_service.py	Tests park/resume serialization round-trip for AgentContext.
tests/unit/security/timeout/test_factory.py	Tests timeout policy factory returns correct implementations.
tests/unit/security/timeout/test_config.py	Tests timeout config discriminated union + TimeoutAction validator.
tests/unit/security/timeout/init.py	Test package marker for timeout tests.
tests/unit/security/test_service.py	Adds coverage for SecOpsService autonomy pre-check routing.
tests/unit/security/autonomy/test_resolver.py	Tests autonomy resolution chain + category/all expansion + seniority constraints.
tests/unit/security/autonomy/test_models.py	Tests autonomy preset/config/effective models and disjoint validation.
tests/unit/security/autonomy/test_change_strategy.py	Tests HumanOnlyPromotionStrategy promotion/downgrade/recovery behavior.
tests/unit/security/autonomy/init.py	Test package marker for autonomy tests.
tests/unit/persistence/test_protocol.py	Extends fake persistence backend/protocol coverage for parked contexts repo.
tests/unit/persistence/test_migrations_v2.py	Updates schema version assertions to v3.
tests/unit/persistence/sqlite/test_parked_context_repo.py	Adds CRUD/upsert/ordering/deserialization-failure tests for parked contexts repo.
tests/unit/persistence/sqlite/test_migrations.py	Adds assertions for v3 parked_contexts table + indexes.
tests/unit/observability/test_events.py	Expands expected event modules and asserts new autonomy/timeout/persistence events.
tests/unit/engine/test_prompt.py	Tests effective autonomy section inclusion/omission in system prompt.
tests/unit/engine/test_loop_protocol.py	Updates TerminationReason enum tests for new PARKED value.
tests/unit/core/test_company.py	Updates CompanyConfig autonomy field semantics + adds approval_timeout tests.
tests/unit/core/conftest.py	Updates factories to provide AutonomyConfig defaults.
tests/unit/api/controllers/test_autonomy.py	Tests new autonomy controller GET/POST and access controls.
tests/unit/api/conftest.py	Adds FakeParkedContextRepository into fake persistence backend for API tests.
src/ai_company/templates/renderer.py	Reuses module-level Jinja env; supports autonomy config dict passthrough; fixes personality handling to avoid mutation.
src/ai_company/security/timeout/timeout_checker.py	Adds TimeoutChecker for evaluating pending approvals against a TimeoutPolicy.
src/ai_company/security/timeout/risk_tier_classifier.py	Adds DefaultRiskTierClassifier with fail-safe HIGH fallback + logging.
src/ai_company/security/timeout/protocol.py	Defines TimeoutPolicy and RiskTierClassifier protocols.
src/ai_company/security/timeout/policies.py	Implements WaitForever/DenyOnTimeout/TieredTimeout/EscalationChain timeout policies.
src/ai_company/security/timeout/parked_context.py	Adds ParkedContext Pydantic model for serialized parked executions.
src/ai_company/security/timeout/park_service.py	Adds ParkService to serialize/deserialize AgentContext for park/resume.
src/ai_company/security/timeout/models.py	Adds TimeoutAction model + escalate_to consistency validator.
src/ai_company/security/timeout/factory.py	Adds create_timeout_policy factory for policy configs.
src/ai_company/security/timeout/config.py	Adds discriminated-union timeout policy configuration models.
src/ai_company/security/timeout/init.py	Exposes timeout subsystem public API.
src/ai_company/security/service.py	Adds effective autonomy integration and pre-check routing in SecOpsService.
src/ai_company/security/autonomy/resolver.py	Adds AutonomyResolver for agent→department→company resolution + pattern expansion + seniority validation.
src/ai_company/security/autonomy/protocol.py	Defines AutonomyChangeStrategy protocol.
src/ai_company/security/autonomy/models.py	Adds autonomy presets/config/effective models + runtime override model.
src/ai_company/security/autonomy/change_strategy.py	Adds HumanOnlyPromotionStrategy (deny promotions/recovery; apply auto-downgrades).
src/ai_company/security/autonomy/init.py	Exposes autonomy subsystem public API.
src/ai_company/persistence/sqlite/parked_context_repo.py	Adds SQLiteParkedContextRepository implementation.
src/ai_company/persistence/sqlite/migrations.py	Bumps schema to v3 and adds parked_contexts table + indexes migration.
src/ai_company/persistence/sqlite/backend.py	Wires parked_contexts repository into SQLite persistence backend.
src/ai_company/persistence/repositories.py	Adds ParkedContextRepository protocol.
src/ai_company/persistence/protocol.py	Extends PersistenceBackend protocol with parked_contexts repository.
src/ai_company/observability/events/timeout.py	Adds timeout event constants.
src/ai_company/observability/events/persistence.py	Adds parked-context persistence event constants.
src/ai_company/observability/events/autonomy.py	Adds autonomy subsystem event constants.
src/ai_company/engine/prompt_template.py	Adds effective autonomy section to system prompt template.
src/ai_company/engine/prompt.py	Passes effective autonomy into prompt context and renders it.
src/ai_company/engine/loop_protocol.py	Adds PARKED termination reason + validation rules.
src/ai_company/engine/agent_engine.py	Threads effective_autonomy into tool invoker/security interceptor and prompt building.
src/ai_company/core/enums.py	Adds AutonomyLevel, DowngradeReason, TimeoutActionType enums.
src/ai_company/core/company.py	Adds autonomy config and approval_timeout to CompanyConfig; adds dept autonomy override field.
src/ai_company/core/agent.py	Adds per-agent autonomy override field to AgentIdentity.
src/ai_company/config/schema.py	Adds autonomy_level to AgentConfig schema.
src/ai_company/api/controllers/autonomy.py	Adds AutonomyController GET/POST endpoints for autonomy level.
src/ai_company/api/controllers/init.py	Registers AutonomyController in controllers module exports/imports.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-10T12:04:57Z

src/ai_company/security/service.py

+        if action in autonomy.auto_approve_actions:
+            logger.info(
+                AUTONOMY_ACTION_AUTO_APPROVED,
+                tool_name=context.tool_name,
+                action_type=action,
+                autonomy_level=autonomy.level.value,
+            )
+            return SecurityVerdict(
+                verdict=SecurityVerdictType.ALLOW,
+                reason=f"Auto-approved by autonomy level '{autonomy.level.value}'",
+                risk_level=ApprovalRiskLevel.LOW,
+                evaluated_at=now,
+                evaluation_duration_ms=0.0,
+            )
+
+        if action in autonomy.human_approval_actions:
+            logger.info(
+                AUTONOMY_ACTION_HUMAN_REQUIRED,
+                tool_name=context.tool_name,
+                action_type=action,
+                autonomy_level=autonomy.level.value,
+            )
+            return SecurityVerdict(
+                verdict=SecurityVerdictType.ESCALATE,
+                reason=(
+                    f"Human approval required by autonomy level "
+                    f"'{autonomy.level.value}'"
+                ),
+                risk_level=ApprovalRiskLevel.MEDIUM,
+                evaluated_at=now,
+                evaluation_duration_ms=0.0,
+            )


_check_autonomy() hard-codes risk_level to LOW for auto-approved actions and MEDIUM for human-required actions. With presets like SEMI auto-approving broad categories (e.g., "code", "vcs"), this can mislabel genuinely HIGH/CRITICAL actions (and also affects what gets stored on ApprovalItem when escalating). Use the existing risk classification logic from the security subsystem to derive risk_level from action_type (or run the rule engine and reuse its computed risk) instead of hard-coding tiers here.

Copilot · 2026-03-10T12:04:58Z

src/ai_company/security/timeout/policies.py

+        cumulative_seconds = 0.0
+        for step in self._chain:
+            step_timeout = step.timeout_minutes * _SECONDS_PER_MINUTE
+            if elapsed_seconds < cumulative_seconds + step_timeout:
+                logger.debug(
+                    TIMEOUT_WAITING,
+                    approval_id=item.id,
+                    escalation_role=step.role,
+                    elapsed_seconds=elapsed_seconds,
+                )
+                return TimeoutAction(
+                    action=TimeoutActionType.ESCALATE,
+                    reason=(
+                        f"Escalated to {step.role!r} — {elapsed_seconds:.0f}s elapsed"
+                    ),
+                    escalate_to=step.role,
+                )


EscalationChainPolicy.determine_action() returns ESCALATE for the entire duration of the current step (elapsed_seconds < cumulative + step_timeout). If TimeoutChecker runs periodically, callers will repeatedly receive ESCALATE for the same escalate_to target and may re-escalate/notify on every tick (there’s no state on ApprovalItem to dedupe). Consider returning WAIT while within a step and only emitting ESCALATE when transitioning to the next step (or include enough state/metadata to make escalation idempotent).

Copilot · 2026-03-10T12:04:58Z

src/ai_company/api/controllers/autonomy.py

+        """Request an autonomy level change for an agent.
+
+        Validates seniority constraints and routes through the
+        configured ``AutonomyChangeStrategy``.  Returns 200 with the
+        current level.  If the change requires human approval, the
+        response includes ``promotion_pending=True``.
+


update_autonomy() docstring says it “Validates seniority constraints and routes through the configured AutonomyChangeStrategy”, but the implementation doesn’t consult agent/department overrides, doesn’t validate seniority, and doesn’t call any strategy (it always returns promotion_pending=True). Please either implement the documented behavior (wire in resolver/strategy + validation) or adjust the docstring/response fields to match the current placeholder behavior to avoid misleading API consumers.

Copilot · 2026-03-10T12:04:58Z

src/ai_company/core/company.py

+    @model_validator(mode="before")
+    @classmethod
+    def _coerce_autonomy_float(cls, data: object) -> object:
+        """Accept a bare float for autonomy and convert to AutonomyConfig."""
+        if not isinstance(data, dict):
+            return data
+        raw = data.get("autonomy")
+        if isinstance(raw, (int, float)) and not isinstance(raw, bool):
+            level = _float_to_autonomy_level(float(raw))
+            return {**data, "autonomy": {"level": level.value}}
+        return data


CompanyConfig._coerce_autonomy_float() converts any numeric autonomy value into an AutonomyConfig without validating the old 0.0–1.0 contract. This regresses prior bounds checks (e.g., -0.1 or 2.0 will silently map to LOCKED/FULL), and NaN/inf will also map to FULL due to comparison semantics. Add explicit validation (finite + 0.0 <= value <= 1.0) before calling _float_to_autonomy_level(), and raise a clear ValueError on invalid inputs.

Copilot · 2026-03-10T12:04:58Z

src/ai_company/security/service.py

+        # Autonomy pre-check: route based on effective autonomy before
+        # the full rule engine.  Hard-deny is always checked first.
+        autonomy_result = await self._apply_autonomy_precheck(context)
+        if autonomy_result is not None:
+            return autonomy_result


The autonomy pre-check short-circuits the rule engine (evaluate_pre_tool() returns early when _apply_autonomy_precheck() yields a verdict). This bypasses the rule engine’s detectors (credential/path traversal/data leak, etc.) even for actions that are “auto-approved” by autonomy presets, which can materially weaken security guarantees compared to the existing PolicyValidator behavior (auto-approve should not skip remaining detection rules). Consider always running the rule engine first, then applying autonomy routing as a post-processing step (e.g., convert ALLOW→ESCALATE when autonomy requires human approval), while still respecting DENY/ESCALATE produced by detectors.

coderabbitai

Actionable comments posted: 37

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/ai_company/engine/agent_engine.py (1)

683-726: ⚠️ Potential issue | 🔴 Critical

Don’t make autonomy enforcement conditional on SecurityConfig.

effective_autonomy only reaches SecOpsService, but
_make_security_interceptor() returns None when security is absent/disabled.
In that branch _make_tool_invoker() still builds a ToolInvoker, so actions
that should park or require approval can run with only the static
tool-permission check. Fail closed here, or provide an autonomy-only
interceptor.

🚫 Minimal fail-closed fix

     def _make_security_interceptor(
         self,
         effective_autonomy: EffectiveAutonomy | None = None,
     ) -> SecurityInterceptionStrategy | None:
         """Build the SecOps security interceptor if configured."""
         if self._security_config is None:
+            if effective_autonomy is not None:
+                msg = (
+                    "effective_autonomy cannot be enforced without SecurityConfig"
+                )
+                logger.error(SECURITY_DISABLED, note=msg)
+                raise ExecutionStateError(msg)
             logger.warning(
                 SECURITY_DISABLED,
                 note="No SecurityConfig provided — all security checks skipped",
             )
             return None
         if not self._security_config.enabled:
+            if effective_autonomy is not None:
+                msg = (
+                    "effective_autonomy cannot be enforced when security is disabled"
+                )
+                logger.error(SECURITY_DISABLED, note=msg)
+                raise ExecutionStateError(msg)
             return None

Also applies to: 728-741

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/engine/agent_engine.py` around lines 683 - 726, The current
_make_security_interceptor returns None when SecurityConfig is missing/disabled
which disables autonomy enforcement; change it to fail-closed by returning an
autonomy-only interceptor that implements SecurityInterceptionStrategy (instead
of None) so autonomy rules still apply even when other security detectors are
off: locate _make_security_interceptor and where it currently returns None for
self._security_config is None or not self._security_config.enabled, and replace
that branch with construction/return of a minimal interceptor (e.g., an
AutonomyEnforcementInterceptor or a SecOpsService instance configured only with
effective_autonomy, self._approval_store, and self._audit_log and no detectors)
so _make_tool_invoker can still rely on this interceptor to park/require
approval according to effective_autonomy.

tests/unit/persistence/test_protocol.py (1)

126-186: ⚠️ Potential issue | 🟡 Minor

Add the missing ParkedContextRepository conformance check.

test_fake_backend_is_persistence_backend() only proves that parked_contexts exists. It does not verify that _FakeParkedContextRepository itself satisfies the repository protocol, so signature drift on the fake can slip through this file.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/persistence/test_protocol.py` around lines 126 - 186, Update the
test to explicitly verify that the fake repository type matches the repository
protocol: import ParkedContextRepository and add a conformance assertion for
_FakeParkedContextRepository (e.g., assert
isinstance(_FakeParkedContextRepository(), ParkedContextRepository) or an
equivalent runtime/type-check that your test suite uses) inside
test_fake_backend_is_persistence_backend so that signature drift on
_FakeParkedContextRepository is caught; reference the symbols
_FakeParkedContextRepository and ParkedContextRepository when adding the check.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/api/controllers/autonomy.py`:
- Around line 73-81: get_autonomy() and update_autonomy() currently ignore
per-agent/department overrides and pending requests by always using
config.autonomy.level and flipping promotion_pending manually; replace this by
routing both handlers through the AutonomyResolver and AutonomyChangeStrategy so
they compute the effective level and validation logic, consult and update the
persistent pending-request store, and emit events only after the strategy
decides (e.g., persist pending vs deny vs apply). Specifically: in
get_autonomy() return AutonomyLevelResponse built from
AutonomyResolver.resolve(agent_id) plus promotion_pending read from the
pending-request store; in update_autonomy() call
AutonomyChangeStrategy.request_change(agent_id, requested_level, actor) (or the
resolver API) to validate/apply/persist the change, persist any pending request
in the store, and return the resulting effective level and promotion_pending
flag instead of echoing config.autonomy.level; ensure AUTONOMY_PROMOTION_DENIED
is emitted only when the strategy denies, not when a request is simply pending.

In `@src/ai_company/core/agent.py`:
- Around line 320-323: The AgentIdentity model currently allows an invalid
combination (level == JUNIOR with autonomy_level == FULL); add a Pydantic
validator (e.g., a root_validator or field validator) on the AgentIdentity class
to detect when level is AutonomyLevel.JUNIOR and autonomy_level is
AutonomyLevel.FULL and raise a ValueError describing the forbidden combination;
reference the autonomy_level and level fields in the check so construction fails
fast and prevents creating AgentIdentity(level=JUNIOR, autonomy_level=FULL).

In `@src/ai_company/core/company.py`:
- Around line 373-383: The before-validator _coerce_autonomy_float should reject
non-finite or out-of-range legacy numeric autonomy values before converting
them; import isfinite from math and, inside _coerce_autonomy_float, when raw is
a numeric (and not bool) first check isfinite(raw) and that raw is within the
valid range (e.g. 0.0 <= raw <= 1.0), and if not raise a ValueError with a clear
message instead of coercing; if it passes, continue to call
_float_to_autonomy_level and return the coerced dict as before.

In `@src/ai_company/engine/prompt_template.py`:
- Around line 144-154: The DEFAULT_TEMPLATE was modified to add autonomy fields
but PROMPT_TEMPLATE_VERSION was not updated; update the PROMPT_TEMPLATE_VERSION
constant (referenced as PROMPT_TEMPLATE_VERSION) to a new semver (e.g., "1.4.0")
so cache/snapshot/telemetry can distinguish the autonomy-aware template, and
ensure any tests or places that import PROMPT_TEMPLATE_VERSION are updated
accordingly; locate the constant near the top of the module and bump it to the
new version to match the DEFAULT_TEMPLATE change.

In `@src/ai_company/engine/prompt.py`:
- Around line 399-404: The projection of EffectiveAutonomy into ctx omits the
security_agent field; update the block that sets ctx["effective_autonomy"] (the
code using effective_autonomy.level, auto_approve_actions,
human_approval_actions) to also include security_agent (e.g., "security_agent":
effective_autonomy.security_agent or its serializable representation) so
templates receive the escalation reviewer info along with level and actions.

In `@src/ai_company/persistence/protocol.py`:
- Around line 114-116: The class docstring for PersistenceBackend is missing the
new public API attribute parked_contexts; update the PersistenceBackend
docstring Attributes section to add an entry for parked_contexts (similar style
to existing entries like collaboration_metrics) describing that it returns a
ParkedContextRepository for ParkedContext persistence and ensure it appears
after collaboration_metrics so generated docs reflect the new property.

In `@src/ai_company/security/autonomy/change_strategy.py`:
- Around line 92-108: The code unconditionally sets current_level =
_DOWNGRADE_MAP[reason], which can raise an agent's autonomy if a later downgrade
maps to a less restrictive level; modify auto_downgrade() logic so when
self._overrides.get(agent_id) exists you do not increase autonomy: compute
new_level = _DOWNGRADE_MAP[reason] but if existing is present set current_level
= the more restrictive of existing.current_level and new_level (i.e., do not
replace an equal-or-more-restrictive level such as LOCKED with a
less-restrictive one); update the AutonomyOverride creation to use that
non-escalating current_level and leave original_level unchanged (use symbols:
_DOWNGRADE_MAP, auto_downgrade(), self._overrides, AutonomyOverride).

In `@src/ai_company/security/autonomy/models.py`:
- Around line 62-135: BUILTIN_PRESETS and AutonomyConfig.presets must be
immutable; change BUILTIN_PRESETS to an immutable Mapping (wrap the literal dict
in types.MappingProxyType) and update AutonomyConfig.presets to use a
Mapping[str, AutonomyPreset] type and a default_factory that returns an
immutable deep-copy of the builtin mapping (e.g., return
types.MappingProxyType(copy.deepcopy(BUILTIN_PRESETS))). Import copy and
types.MappingProxyType and ensure you reference BUILTIN_PRESETS, AutonomyConfig,
and the presets field when making these edits.

In `@src/ai_company/security/autonomy/resolver.py`:
- Around line 50-109: The resolve method and related pattern expansion exceed
the 50-line limit and do multiple responsibilities; refactor by extracting
preset lookup/validation and per-pattern expansion into small helper methods
(e.g., create a _get_preset_or_raise(level) that encapsulates the preset lookup,
warning log and ValueError, and a _expand_pattern_list(patterns) that wraps
_expand_patterns per-item branching), update resolve to call
_get_preset_or_raise and _expand_pattern_list, keep validate_seniority usage and
EffectiveAutonomy construction unchanged, and ensure new helpers are
unit-testable and maintain existing log fields (AUTONOMY_RESOLVED,
resolved_level, agent_override, department_override, counts).

In `@src/ai_company/security/service.py`:
- Around line 148-152: The early return after calling _apply_autonomy_precheck
prevents context from reaching self._rule_engine.evaluate(context); change the
flow so that _apply_autonomy_precheck is used only to short-circuit on an
explicit hard-deny, but otherwise do not return early—always call
self._rule_engine.evaluate(context) when autonomy_result is not a hard-deny, and
then merge or reconcile autonomy_result.risk_level (or related fields) into the
rule engine result so the final decision preserves the full security assessment
(use _apply_autonomy_precheck, _check_autonomy, and self._rule_engine.evaluate
to locate and implement the merge/short-circuit logic).

In `@src/ai_company/security/timeout/config.py`:
- Around line 82-86: The tiers mapping currently typed as dict[str, TierConfig]
allows typos (e.g., "critcal") which silently fall back at runtime; change the
field to use a constrained key type and validate entries explicitly: replace
dict[str, TierConfig] with dict[Literal["low","medium","high","critical"],
TierConfig] (or add a validator on the tiers field) in the model that declares
policy and tiers, and add a pydantic validator that raises a clear
ValidationError if any key is not one of the allowed risk levels
(low/medium/high/critical) so mis-typed config keys fail fast with a helpful
message.
- Around line 57-64: TierConfig.on_timeout and
EscalationChainConfig.on_chain_exhausted currently allow
TimeoutActionType.ESCALATE even though the models don't capture an escalate_to
target, which later causes TimeoutAction(action=ESCALATE, ...) to be constructed
without escalate_to in TieredTimeoutPolicy and EscalationChainPolicy; add
validation on TierConfig and EscalationChainConfig (or their pydantic model
validators) to reject or coerce ESCALATE when no escalate_to is provided: check
the fields TierConfig.on_timeout and EscalationChainConfig.on_chain_exhausted
for TimeoutActionType.ESCALATE and raise a validation error (or change default
to DENY) when escalate target/role is absent so downstream code in
TieredTimeoutPolicy and EscalationChainPolicy will never receive an ESCALATE
action without an escalate_to.

In `@src/ai_company/security/timeout/factory.py`:
- Around line 47-51: When building a TieredTimeoutPolicy from a
TieredTimeoutConfig, validate each tier in config.tiers and reject any tier
whose on_timeout/action is ESCALATE but which lacks an escalate_to target; raise
a clear exception (e.g., ValueError) describing the offending tier so the config
fails fast. Perform this check in the factory branch that returns
TieredTimeoutPolicy (where TieredTimeoutConfig is handled and
DefaultRiskTierClassifier is used) before constructing the TieredTimeoutPolicy.
- Around line 59-63: The logger.warning call in timeout.factory uses a raw event
name string ("timeout.factory.unknown_config"); replace it with a domain event
constant by adding/exporting a timeout event constant (e.g.,
TIMEOUT_FACTORY_UNKNOWN_CONFIG) under ai_company.observability.events.timeout
and then import it into src.ai_company.security.timeout.factory (from
ai_company.observability.events.timeout import TIMEOUT_FACTORY_UNKNOWN_CONFIG)
and use that constant in the logger.warning call (keep the same config_type
kwarg). Also remove the unused msg variable if no longer needed.
- Around line 47-50: TieredTimeoutConfig.tiers is a mutable dict on a frozen
Pydantic model; avoid sharing it directly with the runtime by deep-copying and
making it read-only before constructing the policy. In the branch that returns
TieredTimeoutPolicy, replace passing config.tiers directly with a deep copy
(copy.deepcopy(config.tiers)) and wrap the result with MappingProxyType to
produce an immutable mapping, and ensure you import copy and
types.MappingProxyType; apply this change where TieredTimeoutPolicy(...) is
constructed (referencing TieredTimeoutConfig, TieredTimeoutPolicy,
DefaultRiskTierClassifier, and the tiers field).

In `@src/ai_company/security/timeout/park_service.py`:
- Around line 37-83: After serializing the AgentContext in park(), validate that
the extracted internal identifiers match the outer fields: compare
context.execution_id, context.agent_id, and context.task_id against the values
being stored in ParkedContext (execution_id, agent_id, task_id) and raise
ValueError if any mismatch; when creating ParkedContext ensure you store the
canonical values only after this check. Likewise, in resume() when you
deserialize context_json back to an AgentContext, verify that the deserialized
context.execution_id, context.agent_id, and context.task_id match the
ParkedContext.execution_id, ParkedContext.agent_id, and ParkedContext.task_id
and reject/resist resumption if they differ. Ensure checks reference the
ParkedContext class and the park() and resume() methods and keep metadata
handling unchanged.

In `@src/ai_company/security/timeout/parked_context.py`:
- Around line 29-44: The metadata dict on the frozen Pydantic model
ParkedContext can still be mutated by callers; fix this by deep-copying and
wrapping it with MappingProxyType at creation so it becomes immutable. Add a
field validator for "metadata" (e.g., a `@field_validator`("metadata") on
ParkedContext) that does: metadata = copy.deepcopy(metadata) and return
MappingProxyType(metadata); ensure imports for copy and MappingProxyType are
added and that the validator runs during model instantiation so persisted
snapshots cannot be mutated after creation.

In `@src/ai_company/security/timeout/policies.py`:
- Around line 148-149: determine_action() currently always uses
self._classifier.classify(item.action_type) and ignores TierConfig.actions;
change the logic to first check each TierConfig in self._tiers for a non-empty
actions tuple containing item.action_type and select that tier_config if found,
and only if no TierConfig.actions match fall back to calling
self._classifier.classify(item.action_type) and lookup tier_config by
risk_level.value; update references to tier_config, self._classifier.classify,
and TierConfig.actions accordingly so pinned actions are honored.

In `@src/ai_company/security/timeout/risk_tier_classifier.py`:
- Around line 12-45: Remove the duplicated _DEFAULT_RISK_MAP definition and
instead import and reuse the shared risk map from the existing risk classifier
module (e.g., import DEFAULT_RISK_MAP or the exported map from risk_classifier).
Update references in this file (any uses of _DEFAULT_RISK_MAP, ApprovalRiskLevel
and ActionType lookups) to use the imported map so timeout classification and
security use the single source of truth defined in risk_classifier.py.

In `@src/ai_company/security/timeout/timeout_checker.py`:
- Around line 36-68: Add a guard at the start of check to skip policy evaluation
for items whose ApprovalItem.status is not ApprovalStatus.PENDING: return a
no-op/neutral TimeoutAction (and log the skip) instead of calling
self._policy.determine_action; also ensure check_and_resolve performs the same
status guard before applying any resolution so already-APPROVED/REJECTED/EXPIRED
items cannot be overwritten by timeout logic.

In `@src/ai_company/templates/renderer.py`:
- Around line 526-529: The branch that handles raw_autonomy dicts reuses the
parsed dict instance (raw_autonomy) and assigns it directly to autonomy, causing
aliasing if later normalization/validation mutates it; change the assignment to
make a deep copy (e.g., autonomy = copy.deepcopy(raw_autonomy)) and add an
import for the copy module at the top of the file so the config tree always
receives an isolated dict instance (refer to raw_autonomy and the autonomy
variable in renderer.py).
- Around line 675-678: The code may pass a non-string YAML scalar as preset_name
into get_personality_preset (which calls .strip()), causing AttributeError
instead of the renderer's TemplateRenderError; before calling
get_personality_preset(preset_name) validate that preset_name is an instance of
str and, if not, raise TemplateRenderError with a clear message about invalid
preset type; keep the existing KeyError handling for unknown preset names so the
try/except around get_personality_preset still catches KeyError but the type
check prevents AttributeError from escaping.
- Around line 68-70: The module-level Jinja2 filter "auto" on _JINJA_ENV
incorrectly uses "value or ''" which collapses valid falsy values like 0 or
False; change the filter to only treat None or Jinja2 Undefined as missing —
i.e., import Jinja2's Undefined and update the lambda for
_JINJA_ENV.filters["auto"] to return "" when value is None or isinstance(value,
Undefined), otherwise return the original value (preserving 0, 0.0, False).

In `@tests/unit/api/conftest.py`:
- Around line 198-211: The fake in-memory repo stores and returns the same
ParkedContext instances which share mutable metadata dicts; modify the methods
in the test fake (save, get, get_by_approval, get_by_agent) to deepcopy
ParkedContext objects at the persistence boundary (use import copy at top), i.e.
store copy.deepcopy(context) inside save and return copy.deepcopy(...) from get,
get_by_approval, and get_by_agent so callers receive independent copies
mirroring SQLiteParkedContextRepository serialization behavior.

In `@tests/unit/engine/test_loop_protocol.py`:
- Around line 33-36: Add tests exercising the new PARKED rule: create one
passing test that constructs an ExecutionResult with
termination=TerminationReason.PARKED and asserts that result.error_message is
None (or that validation succeeds), and add one failing test that attempts to
create/validate an ExecutionResult with termination=TerminationReason.PARKED and
a non-None error_message and asserts the operation raises the expected
validation exception (e.g., ValueError or AssertionError); reference
ExecutionResult and TerminationReason.PARKED in the new tests so the
PARKED->error_message=None contract is enforced.

In `@tests/unit/persistence/sqlite/test_migrations.py`:
- Around line 94-115: Update the two tests to validate the parked_contexts
schema, not just names: after run_migrations in
test_v3_creates_parked_contexts_table call "PRAGMA table_info(parked_contexts)"
and assert the returned column names include the repository-required columns
(e.g., "id", "agent_id", "approval_id", plus any expected timestamp or payload
columns your code depends on); in test_v3_creates_parked_context_indexes after
finding index names run "PRAGMA index_info('idx_pc_agent_id')" and "PRAGMA
index_info('idx_pc_approval_id')" and assert those index_info results reference
the proper indexed columns ("agent_id" and "approval_id" respectively); keep
using run_migrations and the same test function names to locate where to add
these assertions.

In `@tests/unit/persistence/sqlite/test_parked_context_repo.py`:
- Around line 7-18: Add a module-level 30-second timeout by defining pytestmark
= pytest.mark.timeout(30) near the top of the test file (after the imports) so
all async tests in this module (which use SQLiteParkedContextRepository and
ParkedContext) get a global timeout; use the pytestmark symbol and
pytest.mark.timeout to implement this.

In `@tests/unit/persistence/test_migrations_v2.py`:
- Around line 31-32: The test test_schema_version_is_three currently only
asserts SCHEMA_VERSION == 3 and therefore misses verifying that v3 artifacts
were actually created; update this test to also query the database schema to
assert the parked_contexts table exists and that the two new indexes
idx_pc_agent_id and idx_pc_approval_id are present (on the upgrade path where
_apply_v3() should run). Locate test_schema_version_is_three and after
confirming SCHEMA_VERSION, run the same connection/PRAGMA or sqlite_master
queries used elsewhere in the suite to verify existence of the parked_contexts
table and that entries for idx_pc_agent_id and idx_pc_approval_id exist; fail
the test if any of those are missing so partial or skipped _apply_v3() is
caught. Ensure you reference the same DB handle/fixture used by other migration
tests so the checks run against the upgraded DB instance.

In `@tests/unit/security/autonomy/test_change_strategy.py`:
- Around line 62-70: Update the test_double_downgrade_preserves_original to
ensure the original_level is preserved: when using HumanOnlyPromotionStrategy
call auto_downgrade the first time with an explicit current_level (e.g.,
AutonomyLevel.HIGH or similar) for agent-1, then call auto_downgrade a second
time with a different DowngradeReason, fetch the override via get_override and
add an assertion that override.original_level equals the explicit level you
seeded; keep the existing assertions for override.current_level and
override.reason to verify the second downgrade replaced only the current state.

In `@tests/unit/security/autonomy/test_models.py`:
- Around line 122-125: The test_config_frozen currently only checks assignment
to the AutonomyConfig.level attribute but misses in-place mutation of the
mutable AutonomyConfig.presets dict; update the test_config_frozen to attempt an
in-place change to config.presets (e.g., mutating an existing key or adding a
key) and assert that this raises an exception or is prevented, and if underlying
model does not yet protect presets, change the AutonomyConfig construction to
deep-copy incoming presets and wrap them with MappingProxyType (using
copy.deepcopy in the AutonomyConfig __init__ or validator) so presets is
read-only; reference AutonomyConfig, presets, test_config_frozen, copy.deepcopy,
and MappingProxyType when making the fixes.

In `@tests/unit/security/autonomy/test_resolver.py`:
- Around line 106-129: Add a test that exercises the public API by calling
resolver.resolve(...) to ensure seniority enforcement is applied there as well:
in TestSeniorityValidation add a case that calls
resolver.resolve(agent_level=AutonomyLevel.FULL,
seniority=SeniorityLevel.JUNIOR) and asserts it raises ValueError (matching
"FULL autonomy"); keep existing validate_seniority() checks but include this
resolve(...) call so the public resolve method is validated for the same JUNIOR
+ FULL rejection.

In `@tests/unit/security/test_service.py`:
- Around line 489-579: Add a new async test (e.g.,
test_auto_approve_blocked_for_high_or_critical_risk) that constructs an
EffectiveAutonomy with the action present in auto_approve_actions, then creates
a context for that action with a HIGH (and/or CRITICAL) risk level (use
_make_context(action_type="...", risk_level=RiskLevel.HIGH) or otherwise mock
the risk classifier), calls service.evaluate_pre_tool(ctx) and asserts the
result is NOT SecurityVerdictType.ALLOW and that auto-approval was not used
(e.g., service._test_rule_engine.evaluate.assert_called_once() or that the
verdict.reason mentions escalation/review); reference EffectiveAutonomy,
auto_approve_actions, evaluate_pre_tool, and SecurityVerdictType.ALLOW when
adding the test.

In `@tests/unit/security/timeout/test_factory.py`:
- Around line 30-47: Update each test to assert behavior/wiring, not just type:
after calling create_timeout_policy with DenyOnTimeoutConfig(timeout_minutes=60)
assert the returned DenyOnTimeoutPolicy has its internal timeout represented as
3600 seconds (verify the concrete attribute on DenyOnTimeoutPolicy that stores
seconds); for TieredTimeoutConfig assert the returned TieredTimeoutPolicy
preserved the provided tier configuration (compare the policy's tiers/config
property to the original TieredTimeoutConfig values); for EscalationChainConfig
assert the returned EscalationChainPolicy preserved the chain and
on_chain_exhausted values (verify the policy.chain contains the
EscalationStep(role="lead", timeout_minutes=30) data and
policy.on_chain_exhausted equals TimeoutActionType.DENY). Ensure you reference
create_timeout_policy, DenyOnTimeoutConfig, DenyOnTimeoutPolicy,
TieredTimeoutConfig, TieredTimeoutPolicy, EscalationChainConfig,
EscalationChainPolicy, EscalationStep, and TimeoutActionType when locating the
fields to assert.

In `@tests/unit/security/timeout/test_park_service.py`:
- Around line 75-92: The test test_resume_restores_context uses
_make_agent_context() which returns an AgentContext with task_execution=None, so
add a real task-bound context before parking: build an AgentContext with a
non-None task_execution containing a Task (with id "task-1" or similar) and use
that context when calling ParkService().park(...); after resume, assert that
restored.task_execution is not None and that restored.task_execution.task.id
equals the original task id to ensure the task survives the round-trip through
ParkService.park and ParkService.resume.

In `@tests/unit/security/timeout/test_parked_context.py`:
- Around line 54-59: Update the frozen-model test and the persistence
deserialization: in the test_frozen() for the ParkedContext model, add an
assertion that an in-place mutation like parked.metadata["key"] = "value" either
raises (or does not change the model when re-fetched) to cover dict mutation on
a frozen Pydantic model; and in the persistence layer method _row_to_model (the
function that converts DB rows into ParkedContext instances) wrap the
deserialized metadata with copy.deepcopy(json.loads(raw_meta)) before
constructing the ParkedContext so the model receives a deep-copied dict and
in-place mutations at call sites cannot mutate shared state.

In `@tests/unit/security/timeout/test_policies.py`:
- Around line 89-141: Add tests that ensure the HIGH and CRITICAL tiers cannot
be auto-approved even if their TierConfig.on_timeout is set to APPROVE: create
TieredTimeoutPolicy instances with tiers like {"high":
TierConfig(timeout_minutes=1, on_timeout=TimeoutActionType.APPROVE)} and
{"critical": TierConfig(...)} and use DefaultRiskTierClassifier plus
_make_item(...) with an action_type that maps to HIGH/CRITICAL (e.g.,
"secrets:write" or another classifier-recognized high/critical action); call
policy.determine_action(item, elapsed_seconds) with elapsed_seconds >
timeout_minutes*60 and assert the returned result.action is
TimeoutActionType.WAIT (repeat or parametrize for both HIGH and CRITICAL to
cover both branches).

In `@tests/unit/security/timeout/test_risk_tier_classifier.py`:
- Around line 1-64: Add the module-level pytest timeout marker by defining
pytestmark = pytest.mark.timeout(30) immediately after the imports in this test
module; update the top of the file (near the imports that include pytest and
DefaultRiskTierClassifier) so the module-level marker applies to all tests (no
changes needed to DefaultRiskTierClassifier or individual test functions).

---

Outside diff comments:
In `@src/ai_company/engine/agent_engine.py`:
- Around line 683-726: The current _make_security_interceptor returns None when
SecurityConfig is missing/disabled which disables autonomy enforcement; change
it to fail-closed by returning an autonomy-only interceptor that implements
SecurityInterceptionStrategy (instead of None) so autonomy rules still apply
even when other security detectors are off: locate _make_security_interceptor
and where it currently returns None for self._security_config is None or not
self._security_config.enabled, and replace that branch with construction/return
of a minimal interceptor (e.g., an AutonomyEnforcementInterceptor or a
SecOpsService instance configured only with effective_autonomy,
self._approval_store, and self._audit_log and no detectors) so
_make_tool_invoker can still rely on this interceptor to park/require approval
according to effective_autonomy.

In `@tests/unit/persistence/test_protocol.py`:
- Around line 126-186: Update the test to explicitly verify that the fake
repository type matches the repository protocol: import ParkedContextRepository
and add a conformance assertion for _FakeParkedContextRepository (e.g., assert
isinstance(_FakeParkedContextRepository(), ParkedContextRepository) or an
equivalent runtime/type-check that your test suite uses) inside
test_fake_backend_is_persistence_backend so that signature drift on
_FakeParkedContextRepository is caught; reference the symbols
_FakeParkedContextRepository and ParkedContextRepository when adding the check.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 45709ea0-f32a-4c74-b095-2e041b61759b

📥 Commits

Reviewing files that changed from the base of the PR and between 02b5f4e and 1855156.

📒 Files selected for processing (59)

src/ai_company/api/controllers/__init__.py
src/ai_company/api/controllers/autonomy.py
src/ai_company/config/schema.py
src/ai_company/core/agent.py
src/ai_company/core/company.py
src/ai_company/core/enums.py
src/ai_company/engine/agent_engine.py
src/ai_company/engine/loop_protocol.py
src/ai_company/engine/prompt.py
src/ai_company/engine/prompt_template.py
src/ai_company/observability/events/autonomy.py
src/ai_company/observability/events/persistence.py
src/ai_company/observability/events/timeout.py
src/ai_company/persistence/protocol.py
src/ai_company/persistence/repositories.py
src/ai_company/persistence/sqlite/backend.py
src/ai_company/persistence/sqlite/migrations.py
src/ai_company/persistence/sqlite/parked_context_repo.py
src/ai_company/security/autonomy/__init__.py
src/ai_company/security/autonomy/change_strategy.py
src/ai_company/security/autonomy/models.py
src/ai_company/security/autonomy/protocol.py
src/ai_company/security/autonomy/resolver.py
src/ai_company/security/service.py
src/ai_company/security/timeout/__init__.py
src/ai_company/security/timeout/config.py
src/ai_company/security/timeout/factory.py
src/ai_company/security/timeout/models.py
src/ai_company/security/timeout/park_service.py
src/ai_company/security/timeout/parked_context.py
src/ai_company/security/timeout/policies.py
src/ai_company/security/timeout/protocol.py
src/ai_company/security/timeout/risk_tier_classifier.py
src/ai_company/security/timeout/timeout_checker.py
src/ai_company/templates/renderer.py
tests/unit/api/conftest.py
tests/unit/api/controllers/test_autonomy.py
tests/unit/core/conftest.py
tests/unit/core/test_company.py
tests/unit/engine/test_loop_protocol.py
tests/unit/engine/test_prompt.py
tests/unit/observability/test_events.py
tests/unit/persistence/sqlite/test_migrations.py
tests/unit/persistence/sqlite/test_parked_context_repo.py
tests/unit/persistence/test_migrations_v2.py
tests/unit/persistence/test_protocol.py
tests/unit/security/autonomy/__init__.py
tests/unit/security/autonomy/test_change_strategy.py
tests/unit/security/autonomy/test_models.py
tests/unit/security/autonomy/test_resolver.py
tests/unit/security/test_service.py
tests/unit/security/timeout/__init__.py
tests/unit/security/timeout/test_config.py
tests/unit/security/timeout/test_factory.py
tests/unit/security/timeout/test_park_service.py
tests/unit/security/timeout/test_parked_context.py
tests/unit/security/timeout/test_policies.py
tests/unit/security/timeout/test_risk_tier_classifier.py
tests/unit/security/timeout/test_timeout_checker.py

coderabbitai · 2026-03-10T12:12:57Z

src/ai_company/api/controllers/autonomy.py

+        app_state: AppState = state.app_state
+        config = app_state.config.config
+        level = config.autonomy.level
+        return ApiResponse(
+            data=AutonomyLevelResponse(
+                agent_id=agent_id,
+                level=level,
+            ),
+        )


⚠️ Potential issue | 🟠 Major

These endpoints never read or write per-agent autonomy state.

get_autonomy() always returns config.autonomy.level with promotion_pending=False, and update_autonomy() only logs before echoing the same level back with promotion_pending=True. That ignores agent overrides, department overrides, active downgrade overrides, seniority validation, and any persisted pending request, so every agent reports the same value and a follow-up GET immediately loses the pending state. It also emits AUTONOMY_PROMOTION_DENIED for requests the API is treating as pending. This needs to go through the actual AutonomyResolver / AutonomyChangeStrategy flow plus a store for pending requests.

Also applies to: 105-132

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/api/controllers/autonomy.py` around lines 73 - 81, get_autonomy() and update_autonomy() currently ignore per-agent/department overrides and pending requests by always using config.autonomy.level and flipping promotion_pending manually; replace this by routing both handlers through the AutonomyResolver and AutonomyChangeStrategy so they compute the effective level and validation logic, consult and update the persistent pending-request store, and emit events only after the strategy decides (e.g., persist pending vs deny vs apply). Specifically: in get_autonomy() return AutonomyLevelResponse built from AutonomyResolver.resolve(agent_id) plus promotion_pending read from the pending-request store; in update_autonomy() call AutonomyChangeStrategy.request_change(agent_id, requested_level, actor) (or the resolver API) to validate/apply/persist the change, persist any pending request in the store, and return the resulting effective level and promotion_pending flag instead of echoing config.autonomy.level; ensure AUTONOMY_PROMOTION_DENIED is emitted only when the strategy denies, not when a request is simply pending.

src/ai_company/core/agent.py

src/ai_company/core/company.py

src/ai_company/engine/prompt_template.py

src/ai_company/engine/prompt.py

coderabbitai · 2026-03-10T12:12:59Z

tests/unit/security/timeout/test_factory.py

+    @pytest.mark.unit
+    def test_deny_on_timeout(self) -> None:
+        result = create_timeout_policy(DenyOnTimeoutConfig(timeout_minutes=60))
+        assert isinstance(result, DenyOnTimeoutPolicy)
+
+    @pytest.mark.unit
+    def test_tiered(self) -> None:
+        result = create_timeout_policy(TieredTimeoutConfig())
+        assert isinstance(result, TieredTimeoutPolicy)
+
+    @pytest.mark.unit
+    def test_escalation_chain(self) -> None:
+        config = EscalationChainConfig(
+            chain=(EscalationStep(role="lead", timeout_minutes=30),),
+            on_chain_exhausted=TimeoutActionType.DENY,
+        )
+        result = create_timeout_policy(config)
+        assert isinstance(result, EscalationChainPolicy)


⚠️ Potential issue | 🟡 Minor

Assert the factory wiring, not just the concrete class.

These cases only verify dispatch. A regression in the timeout_minutes -> seconds conversion or in forwarding chain / on_chain_exhausted would still pass as long as the returned class stays the same. Please add at least one behavior-level assertion per configurable policy so this suite catches broken wiring as well as wrong type selection.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/security/timeout/test_factory.py` around lines 30 - 47, Update each test to assert behavior/wiring, not just type: after calling create_timeout_policy with DenyOnTimeoutConfig(timeout_minutes=60) assert the returned DenyOnTimeoutPolicy has its internal timeout represented as 3600 seconds (verify the concrete attribute on DenyOnTimeoutPolicy that stores seconds); for TieredTimeoutConfig assert the returned TieredTimeoutPolicy preserved the provided tier configuration (compare the policy's tiers/config property to the original TieredTimeoutConfig values); for EscalationChainConfig assert the returned EscalationChainPolicy preserved the chain and on_chain_exhausted values (verify the policy.chain contains the EscalationStep(role="lead", timeout_minutes=30) data and policy.on_chain_exhausted equals TimeoutActionType.DENY). Ensure you reference create_timeout_policy, DenyOnTimeoutConfig, DenyOnTimeoutPolicy, TieredTimeoutConfig, TieredTimeoutPolicy, EscalationChainConfig, EscalationChainPolicy, EscalationStep, and TimeoutActionType when locating the fields to assert.

coderabbitai · 2026-03-10T12:12:59Z

tests/unit/security/timeout/test_park_service.py

+    def test_resume_restores_context(self) -> None:
+        """Parks then resumes, verifies round-trip fidelity."""
+        context = _make_agent_context()
+        service = ParkService()
+
+        parked = service.park(
+            context=context,
+            approval_id="approval-1",
+            agent_id="agent-1",
+            task_id="task-1",
+        )
+
+        restored = service.resume(parked)
+
+        assert restored.execution_id == context.execution_id
+        assert restored.turn_count == context.turn_count
+        assert restored.identity.name == context.identity.name
+        assert restored.identity.role == context.identity.role


⚠️ Potential issue | 🟡 Minor

Round-trip coverage never exercises a task-bound context.

_make_agent_context() builds AgentContext(task_execution=None), so this test would still pass if park/resume silently dropped the blocked task. Since the feature is parking approval-gated work, please create the context with a real task and assert restored.task_execution.task.id survives the round-trip.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/security/timeout/test_park_service.py` around lines 75 - 92, The test test_resume_restores_context uses _make_agent_context() which returns an AgentContext with task_execution=None, so add a real task-bound context before parking: build an AgentContext with a non-None task_execution containing a Task (with id "task-1" or similar) and use that context when calling ParkService().park(...); after resume, assert that restored.task_execution is not None and that restored.task_execution.task.id equals the original task id to ensure the task survives the round-trip through ParkService.park and ParkService.resume.

coderabbitai · 2026-03-10T12:12:59Z

tests/unit/security/timeout/test_parked_context.py

+    def test_frozen(self) -> None:
+        """Cannot modify fields on a frozen model."""
+        parked = _make_parked_context()
+        with pytest.raises(ValidationError):
+            parked.agent_id = "other"  # type: ignore[misc]
+


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In Pydantic v2, does ConfigDict(frozen=True)prevent in-place mutation ofdictorlist fields, or only attribute reassignment?

💡 Result:

ConfigDict(frozen=True) only makes the model itself “faux-immutable” by blocking attribute setting (i.e., disallowing __setattr__, and thus preventing model.field = ...)—it does not deep-freeze contained values. So if a field value is a mutable object like a list or dict, in-place mutation still works (e.g., model.my_list.append(...), model.my_dict["k"]=...) because you’re not reassigning the attribute, you’re mutating the object it references. [1], [2]

Implication: if you need true immutability for containers, use immutable types (e.g., tuple, frozenset, immutable mapping wrappers) or wrap/convert during validation. [1], [2]

Sources

Pydantic config docs (frozen: blocks __setattr__) https://docs.pydantic.dev/latest/api/config/

Example noting underlying dict remains mutable even when model is frozen https://www.jujens.eu/posts/en/2025/Apr/19/using-custom-classes-pydantic/

🏁 Script executed:

cat -n src/ai_company/security/timeout/parked_context.py | head -50

Repository: Aureliolo/ai-company

Length of output: 2092

🏁 Script executed:

# Check where ParkedContext is used and how metadata is accessed/modified rg -n "ParkedContext" --type py -A 2 -B 2 | head -100

Repository: Aureliolo/ai-company

Length of output: 7561

🏁 Script executed:

# Check if metadata is mutated anywhere or if deepcopy is used rg -n "\.metadata" --type py -B 1 -A 1 | head -80

Repository: Aureliolo/ai-company

Length of output: 4971

🏁 Script executed:

# Check specifically for ParkedContext metadata mutations or deepcopy usage rg -n "parked.*metadata|metadata.*parked" --type py -B 2 -A 2

Repository: Aureliolo/ai-company

Length of output: 3988

🏁 Script executed:

# Check the persistence repo to see if deepcopy is used when loading/saving ParkedContext cat -n src/ai_company/persistence/sqlite/parked_context_repo.py | sed -n '40,90p'

Repository: Aureliolo/ai-company

Length of output: 2250

🏁 Script executed:

# Check the complete loading logic for ParkedContext cat -n src/ai_company/persistence/sqlite/parked_context_repo.py | sed -n '84,115p'

Repository: Aureliolo/ai-company

Length of output: 1433

🏁 Script executed:

# Find the _row_to_model method cat -n src/ai_company/persistence/sqlite/parked_context_repo.py | sed -n '1,40p'

Repository: Aureliolo/ai-company

Length of output: 1535

🏁 Script executed:

# Search for _row_to_model method rg -n "_row_to_model" src/ai_company/persistence/sqlite/parked_context_repo.py -A 15

Repository: Aureliolo/ai-company

Length of output: 2986

🏁 Script executed:

# Verify the test file doesn't have any in-place mutation tests for metadata cat -n tests/unit/security/timeout/test_parked_context.py | sed -n '54,85p'

Repository: Aureliolo/ai-company

Length of output: 1386

Extend the frozen model test to cover in-place metadata mutation, and apply copy.deepcopy() to the metadata field at the persistence boundary.

Pydantic v2's frozen=True only prevents attribute reassignment (line 58 checks this), but in-place mutation of dict fields still works. The metadata field on the frozen ParkedContext model can be mutated via parked.metadata["key"] = value, which the current test misses.

Additionally, per coding guidelines, dict/list fields on frozen Pydantic models must use copy.deepcopy() at system boundaries. The persistence layer's _row_to_model() method (line 171–181 in src/ai_company/persistence/sqlite/parked_context_repo.py) deserializes metadata without deepcopy, violating this requirement.

Fixes needed:

Update test_frozen() to verify parked.metadata["key"] = value raises an error or does not persist

Apply copy.deepcopy(json.loads(raw_meta)) in _row_to_model() before constructing ParkedContext

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/security/timeout/test_parked_context.py` around lines 54 - 59, Update the frozen-model test and the persistence deserialization: in the test_frozen() for the ParkedContext model, add an assertion that an in-place mutation like parked.metadata["key"] = "value" either raises (or does not change the model when re-fetched) to cover dict mutation on a frozen Pydantic model; and in the persistence layer method _row_to_model (the function that converts DB rows into ParkedContext instances) wrap the deserialized metadata with copy.deepcopy(json.loads(raw_meta)) before constructing the ParkedContext so the model receives a deep-copied dict and in-place mutations at call sites cannot mutate shared state.

tests/unit/security/timeout/test_policies.py

coderabbitai · 2026-03-10T12:12:59Z

tests/unit/security/timeout/test_risk_tier_classifier.py

+"""Tests for DefaultRiskTierClassifier."""
+
+import pytest
+
+from ai_company.core.enums import ActionType, ApprovalRiskLevel
+from ai_company.security.timeout.risk_tier_classifier import DefaultRiskTierClassifier
+
+
+class TestDefaultMapping:
+    """Default risk tier mapping."""
+
+    @pytest.mark.unit
+    def test_critical_actions(self) -> None:
+        classifier = DefaultRiskTierClassifier()
+        expected = ApprovalRiskLevel.CRITICAL
+        assert classifier.classify(ActionType.DEPLOY_PRODUCTION) == expected
+        assert classifier.classify(ActionType.DB_ADMIN) == expected
+
+    @pytest.mark.unit
+    def test_high_actions(self) -> None:
+        classifier = DefaultRiskTierClassifier()
+        assert classifier.classify(ActionType.VCS_PUSH) == ApprovalRiskLevel.HIGH
+        assert classifier.classify(ActionType.CODE_DELETE) == ApprovalRiskLevel.HIGH
+
+    @pytest.mark.unit
+    def test_medium_actions(self) -> None:
+        classifier = DefaultRiskTierClassifier()
+        assert classifier.classify(ActionType.CODE_WRITE) == ApprovalRiskLevel.MEDIUM
+
+    @pytest.mark.unit
+    def test_low_actions(self) -> None:
+        classifier = DefaultRiskTierClassifier()
+        assert classifier.classify(ActionType.CODE_READ) == ApprovalRiskLevel.LOW
+        assert classifier.classify(ActionType.TEST_RUN) == ApprovalRiskLevel.LOW
+
+
+class TestUnknownFallback:
+    """Unknown action types default to HIGH (D19)."""
+
+    @pytest.mark.unit
+    def test_unknown_defaults_to_high(self) -> None:
+        classifier = DefaultRiskTierClassifier()
+        assert classifier.classify("unknown:action") == ApprovalRiskLevel.HIGH
+
+
+class TestCustomMap:
+    """Custom risk overrides."""
+
+    @pytest.mark.unit
+    def test_custom_override(self) -> None:
+        classifier = DefaultRiskTierClassifier(
+            custom_map={ActionType.CODE_READ: ApprovalRiskLevel.CRITICAL}
+        )
+        assert classifier.classify(ActionType.CODE_READ) == ApprovalRiskLevel.CRITICAL
+
+    @pytest.mark.unit
+    def test_custom_preserves_defaults(self) -> None:
+        classifier = DefaultRiskTierClassifier(
+            custom_map={"custom:action": ApprovalRiskLevel.LOW}
+        )
+        # Default still works.
+        assert classifier.classify(ActionType.CODE_READ) == ApprovalRiskLevel.LOW
+        # Custom also works.
+        assert classifier.classify("custom:action") == ApprovalRiskLevel.LOW


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Search for timeout marker usage in test files fd -e py "tests/" -type f | head -20 | xargs grep -l "timeout" 2>/dev/null | head -10

Repository: Aureliolo/ai-company

Length of output: 234

🏁 Script executed:

# Look for pytest configuration files fd -e ini -e toml -e cfg "pytest|setup" -type f | head -10

Repository: Aureliolo/ai-company

Length of output: 234

🏁 Script executed:

# Search more broadly for timeout in all test files rg "pytest\.mark\.timeout|timeout.*30" tests/ -t py -A 2 -B 2

Repository: Aureliolo/ai-company

Length of output: 50376

🏁 Script executed:

# Check if the specific file has the timeout marker cat tests/unit/security/timeout/test_risk_tier_classifier.py | head -10

Repository: Aureliolo/ai-company

Length of output: 338

Add module-level timeout marker to align with test policy.

Add pytestmark = pytest.mark.timeout(30) after the imports. This is required for all unit test files per coding guidelines and is already consistently used throughout the test suite.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/security/timeout/test_risk_tier_classifier.py` around lines 1 - 64, Add the module-level pytest timeout marker by defining pytestmark = pytest.mark.timeout(30) immediately after the imports in this test module; update the top of the file (near the imports that include pytest and DefaultRiskTierClassifier) so the module-level marker applies to all tests (no changes needed to DefaultRiskTierClassifier or individual test functions).

gemini-code-assist · 2026-03-10T12:53:31Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the agent system by introducing robust autonomy management and approval timeout mechanisms. It allows for dynamic control over agent decision-making, ensuring that actions requiring human oversight are properly routed and handled within defined timeframes. The changes also include a new persistence layer for parking agent execution contexts, enabling seamless resumption of tasks after approvals. This foundational work improves system security, operational control, and agent reliability.

Highlights

Autonomy Levels: Introduced a three-level resolution chain (agent → department → company default) for agent autonomy, managed by AutonomyResolver and EffectiveAutonomy model. This includes action type expansion via ActionTypeRegistry and seniority constraints (JUNIOR agents cannot have FULL autonomy). A HumanOnlyPromotionStrategy is implemented for auto-downgrades.
Approval Timeout Policies: Added four pluggable TimeoutPolicy implementations: Wait Forever, Deny on Timeout, Tiered per risk level, and Escalation Chain. These are configured via a discriminated union, supported by a DefaultRiskTierClassifier and managed by a TimeoutChecker service.
Task Park/Resume: Implemented a ParkService to serialize AgentContext into ParkedContext for persistence when an agent's task is awaiting approval. A new PARKED termination reason was added to ExecutionResult.
Persistence: The SQLite database schema was migrated to version 3, adding a parked_contexts table with appropriate indexes. A SQLiteParkedContextRepository was implemented to provide full CRUD operations for parked contexts.
API Endpoint: A new AutonomyController was added and registered with the application router, providing GET and POST endpoints for managing agent autonomy levels at /agents/{agent_id}/autonomy.
Security Integration: Integrated an autonomy pre-check into SecOpsService to route actions based on effective autonomy levels before the full rule engine. This includes a security guard blocking auto-approval for HIGH/CRITICAL risk tiers.
Pre-PR Review Fixes: Addressed 51 findings from 10 agents, including fixes for the autonomy controller, disjoint validator on EffectiveAutonomy action sets, escalate_to consistency validator on TimeoutAction, MemoryError/RecursionError re-raises, _row_to_model error handling, renaming YamlRiskTierClassifier, moving Jinja2 environment to a module-level singleton, personality mutation patterns, immutability violations, and explicit column enumeration in SELECT queries.

Changelog

src/ai_company/api/controllers/init.py
- Imported AutonomyController.
- Registered AutonomyController for API routing.
src/ai_company/api/controllers/autonomy.py
- Added AutonomyController with GET/POST endpoints for managing agent autonomy levels.
src/ai_company/config/schema.py
- Imported AutonomyLevel, AutonomyConfig, and ApprovalTimeoutConfig.
- Added autonomy_level field to AgentConfig.
- Updated CompanyConfig to use AutonomyConfig and ApprovalTimeoutConfig.
- Included a model_validator for backward compatibility with float autonomy values.
src/ai_company/core/agent.py
- Imported AutonomyLevel.
- Added autonomy_level field to AgentIdentity.
src/ai_company/core/company.py
- Imported autonomy and timeout configuration models.
- Added autonomy_level to Department model.
- Updated CompanyConfig to use AutonomyConfig and ApprovalTimeoutConfig models.
- Included a backward compatibility validator for autonomy float values.
src/ai_company/core/enums.py
- Added new AutonomyLevel enum.
- Added new DowngradeReason enum.
- Added new TimeoutActionType enum.
src/ai_company/engine/agent_engine.py
- Imported EffectiveAutonomy.
- Modified run method to accept effective_autonomy parameter.
- Modified _prepare_context method to accept effective_autonomy parameter.
- Modified _make_security_interceptor method to accept effective_autonomy parameter.
- Modified _make_tool_invoker method to accept effective_autonomy parameter.
src/ai_company/engine/loop_protocol.py
- Added PARKED as a TerminationReason.
- Updated _validate_error_message to handle the PARKED termination reason.
src/ai_company/engine/prompt.py
- Imported EffectiveAutonomy.
- Modified build_system_prompt to pass effective_autonomy.
- Modified _build_core_context to pass effective_autonomy.
- Modified _build_template_context to pass effective_autonomy.
- Modified _trim_sections to pass effective_autonomy.
- Modified _render_and_estimate to pass effective_autonomy.
src/ai_company/engine/prompt_template.py
- Modified the Jinja2 prompt template to display effective_autonomy details.
src/ai_company/observability/events/autonomy.py
- Added constants for various autonomy-related events.
src/ai_company/observability/events/persistence.py
- Added new event constants for ParkedContext persistence operations.
src/ai_company/observability/events/timeout.py
- Added constants for various approval timeout events.
src/ai_company/persistence/protocol.py
- Added parked_contexts property to PersistenceBackend protocol.
src/ai_company/persistence/repositories.py
- Added ParkedContextRepository protocol definition.
src/ai_company/persistence/sqlite/backend.py
- Imported SQLiteParkedContextRepository.
- Initialized and exposed _parked_contexts property in SQLitePersistenceBackend.
src/ai_company/persistence/sqlite/migrations.py
- Bumped SCHEMA_VERSION to 3.
- Added _V3_STATEMENTS to create the parked_contexts table and its indexes.
- Included _apply_v3 in _MIGRATIONS.
src/ai_company/persistence/sqlite/parked_context_repo.py
- Implemented SQLiteParkedContextRepository for CRUD operations on ParkedContext objects.
src/ai_company/security/autonomy/init.py
- Defined __all__ for the new autonomy modules.
src/ai_company/security/autonomy/change_strategy.py
- Implemented HumanOnlyPromotionStrategy to manage autonomy level changes.
src/ai_company/security/autonomy/models.py
- Defined data models for AutonomyPreset, AutonomyConfig, EffectiveAutonomy, and AutonomyOverride.
src/ai_company/security/autonomy/protocol.py
- Defined AutonomyChangeStrategy protocol.
src/ai_company/security/autonomy/resolver.py
- Implemented AutonomyResolver to determine effective autonomy and expand action type patterns.
src/ai_company/security/service.py
- Imported autonomy events and EffectiveAutonomy.
- Added effective_autonomy to SecOpsService constructor.
- Implemented _apply_autonomy_precheck to route actions based on autonomy.
- Added MemoryError/RecursionError re-raises in audit and escalation handling.
src/ai_company/security/timeout/init.py
- Defined __all__ for the new timeout modules.
src/ai_company/security/timeout/config.py
- Defined a discriminated union ApprovalTimeoutConfig for four policy types.
src/ai_company/security/timeout/factory.py
- Implemented create_timeout_policy function to instantiate TimeoutPolicy objects.
src/ai_company/security/timeout/models.py
- Defined TimeoutAction model, including validation for escalate_to.
src/ai_company/security/timeout/park_service.py
- Implemented ParkService to serialize/deserialize AgentContext to ParkedContext.
src/ai_company/security/timeout/parked_context.py
- Defined ParkedContext model to store a snapshot of a parked agent execution.
src/ai_company/security/timeout/policies.py
- Implemented four TimeoutPolicy types: WaitForeverPolicy, DenyOnTimeoutPolicy, TieredTimeoutPolicy, and EscalationChainPolicy.
src/ai_company/security/timeout/protocol.py
- Defined TimeoutPolicy and RiskTierClassifier protocols.
src/ai_company/security/timeout/risk_tier_classifier.py
- Implemented DefaultRiskTierClassifier to map action types to risk tiers.
src/ai_company/security/timeout/timeout_checker.py
- Implemented TimeoutChecker service to evaluate pending approval items.
src/ai_company/templates/renderer.py
- Moved the Jinja2 environment to a module-level singleton.
- Updated _extract_numeric_config to handle autonomy as float or dictionary.
- Refactored _resolve_agent_personality to return personality dictionary instead of mutating agent_dict.
tests/unit/api/conftest.py
- Imported ParkedContext.
- Added FakeParkedContextRepository to FakePersistenceBackend.
tests/unit/api/controllers/test_autonomy.py
- Added unit tests for AutonomyController GET and POST endpoints.
tests/unit/core/conftest.py
- Imported AutonomyConfig.
- Updated CompanyConfigFactory to include autonomy.
- Updated CompanyFactory to include config.
tests/unit/core/test_company.py
- Imported AutonomyLevel, DenyOnTimeoutConfig, WaitForeverConfig.
- Updated TestCompanyConfig to test autonomy as AutonomyConfig and its backward compatibility.
- Added TestCompanyConfigApprovalTimeout for new timeout configurations.
tests/unit/engine/test_loop_protocol.py
- Updated tests to reflect the addition of PARKED to TerminationReason.
tests/unit/engine/test_prompt.py
- Imported AutonomyLevel and EffectiveAutonomy.
- Added TestEffectiveAutonomyInPrompt to verify autonomy details in system prompts.
tests/unit/observability/test_events.py
- Updated test_all_domain_modules_discovered to include new autonomy and timeout event modules.
- Added test_autonomy_events_exist to verify new autonomy event constants.
- Added test_timeout_events_exist to verify new timeout event constants.
- Added test_parked_context_persistence_events_exist to verify new parked context persistence event constants.
tests/unit/persistence/sqlite/test_migrations.py
- Added tests test_v3_creates_parked_contexts_table and test_v3_creates_parked_context_indexes.
tests/unit/persistence/sqlite/test_parked_context_repo.py
- Added comprehensive unit tests for SQLiteParkedContextRepository.
tests/unit/persistence/test_migrations_v2.py
- Updated test_schema_version_is_two to test_schema_version_is_three and adjusted expected schema version.
tests/unit/persistence/test_protocol.py
- Imported ParkedContext.
- Added _FakeParkedContextRepository to the fake persistence backend.
tests/unit/security/autonomy/test_change_strategy.py
- Added unit tests for HumanOnlyPromotionStrategy.
tests/unit/security/autonomy/test_models.py
- Added unit tests for autonomy-related data models.
tests/unit/security/autonomy/test_resolver.py
- Added unit tests for AutonomyResolver.
tests/unit/security/test_service.py
- Imported AutonomyLevel and EffectiveAutonomy.
- Added TestAutonomyPrecheck to verify the new autonomy pre-check logic in SecOpsService.
tests/unit/security/timeout/test_config.py
- Added unit tests for ApprovalTimeoutConfig and its constituent models.
tests/unit/security/timeout/test_factory.py
- Added unit tests for create_timeout_policy factory function.
tests/unit/security/timeout/test_park_service.py
- Added unit tests for ParkService.
tests/unit/security/timeout/test_parked_context.py
- Added unit tests for the ParkedContext data model.
tests/unit/security/timeout/test_policies.py
- Added unit tests for WaitForeverPolicy, DenyOnTimeoutPolicy, TieredTimeoutPolicy, and EscalationChainPolicy.
tests/unit/security/timeout/test_risk_tier_classifier.py
- Added unit tests for DefaultRiskTierClassifier.
tests/unit/security/timeout/test_timeout_checker.py
- Added unit tests for TimeoutChecker.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces significant new features for agent autonomy and approval timeouts. The changes are extensive, adding new models, services, persistence layers, and API endpoints. The implementation of autonomy levels with a three-tier resolution chain and pluggable timeout policies is well-structured. The code also includes several important fixes, such as preventing MemoryError from being caught in broad exception blocks and improving the template rendering performance.

My review focuses on a few areas for improvement. The new AutonomyController currently uses the company-wide default autonomy level instead of resolving the agent-specific level, which is a correctness issue. I've also pointed out a couple of smaller maintainability improvements related to magic numbers and logging clarity. Overall, this is a substantial and well-executed feature addition.

_{Note: Security Review did not run due to the size of the PR.}

gemini-code-assist · 2026-03-10T12:57:19Z

src/ai_company/api/controllers/autonomy.py

+        """
+        app_state: AppState = state.app_state
+        config = app_state.config.config
+        level = config.autonomy.level


This implementation returns the company-wide default autonomy level, not the agent-specific one. The endpoint GET /agents/{agent_id}/autonomy implies it should return the level for the specified agent_id. This could be misleading for clients of the API. Consider fetching the agent's data and using the AutonomyResolver to determine their effective autonomy level.

gemini-code-assist · 2026-03-10T12:57:19Z

src/ai_company/api/controllers/autonomy.py

+        """
+        app_state: AppState = state.app_state
+        config = app_state.config.config
+        current_level = config.autonomy.level


Similar to get_autonomy, this method uses the company-wide default autonomy level as the current_level. This is likely incorrect as it doesn't reflect the agent's actual current autonomy level, which might have overrides. This could lead to incorrect logging and behavior. You should resolve the agent-specific autonomy level here as well.

gemini-code-assist · 2026-03-10T12:57:19Z

src/ai_company/core/company.py

+    Thresholds: 0.0-0.24 → locked, 0.25-0.49 → supervised,
+    0.5-0.79 → semi, 0.8-1.0 → full.
+    """
+    if value < 0.25:  # noqa: PLR2004


The thresholds 0.25, 0.5, and 0.8 used for converting a float to an AutonomyLevel are magic numbers. To improve readability and maintainability, consider defining them as named constants at the module level, for example _LOCKED_THRESHOLD = 0.25.

gemini-code-assist · 2026-03-10T12:57:19Z

src/ai_company/persistence/sqlite/parked_context_repo.py

+        except (sqlite3.Error, aiosqlite.Error) as exc:
+            msg = f"Failed to delete parked context {parked_id!r}"
+            logger.exception(
+                PERSISTENCE_PARKED_CONTEXT_QUERY_FAILED,


The log event PERSISTENCE_PARKED_CONTEXT_QUERY_FAILED is used here for a delete operation failure. This is misleading for anyone debugging persistence issues. It would be better to use a more specific event like PERSISTENCE_PARKED_CONTEXT_DELETE_FAILED. You may need to define this new event constant.

Implement four autonomy levels (full/semi/supervised/locked) with three-level resolution chain (agent→department→company), per-action classification with category expansion, seniority validation, and runtime changes via pluggable strategy. Add four timeout policies (wait-forever/deny/tiered/escalation-chain) with risk tier classification, parked context persistence, and timeout checker. - Phase 1: AutonomyLevel/DowngradeReason enums, AutonomyPreset, AutonomyConfig, EffectiveAutonomy, AutonomyResolver, HumanOnlyPromotionStrategy, AutonomyChangeStrategy protocol - Phase 2: CompanyConfig.autonomy float→AutonomyConfig migration, Department.autonomy_level, AgentIdentity.autonomy_level - Phase 3: SecOpsService autonomy pre-check (auto-approve/escalate), AgentEngine effective_autonomy param, AutonomyController REST API - Phase 4: Effective autonomy section in system prompt template - Phase 5: TimeoutActionType enum, TimeoutPolicy protocol, four policy implementations, discriminated union config, factory - Phase 6: ParkedContext model, ParkedContextRepository protocol, SQLite implementation, v3 migration - Phase 7: CompanyConfig.approval_timeout field - Phase 8: ParkService park/resume, TimeoutChecker, PARKED termination reason Closes #42, Closes #126

Pre-reviewed by 10 agents, 51 findings addressed: - Fix autonomy controller returning requested level instead of current - Add disjoint validator on EffectiveAutonomy action sets - Add escalate_to consistency validator on TimeoutAction - Add seniority constraint enforcement in AutonomyResolver - Add MemoryError/RecursionError re-raise in security service - Fix _row_to_model to raise QueryError instead of returning None - Rename YamlRiskTierClassifier to DefaultRiskTierClassifier - Move Jinja2 env to module-level singleton in renderer - Fix personality mutation pattern (return instead of mutate) - Add security guard blocking auto-approve for HIGH/CRITICAL risk - Fix immutability violations (deepcopy metadata, immutable dicts) - Enumerate columns explicitly in SELECT queries - Register AutonomyController with app router - Add comprehensive tests for new code paths

…dates - Fix circular import in security/autonomy/__init__.py (removed eager AutonomyResolver import that caused core→security→core cycle) - Fix CompanyConfigFactory to pin approval_timeout=WaitForeverConfig() - Update prompt template version assertions to 1.4.0 - Fix test_non_pending_item_raises to supply decided_at/decided_by - Replace assert with restructured control flow in TieredTimeoutPolicy - Update DESIGN_SPEC.md, CLAUDE.md, README.md for autonomy/timeout docs

greptile-apps · 2026-03-10T13:37:24Z

src/ai_company/security/service.py

+            except MemoryError, RecursionError:
+                raise


Same Python 2 except syntax error (second occurrence)

Suggested change

except MemoryError, RecursionError:

raise

except (MemoryError, RecursionError):

Prompt To Fix With AI

This is a comment left during a code review. Path: src/ai_company/security/service.py Line: 239-240 Comment: **Same Python 2 `except` syntax error (second occurrence)** ```suggestion except (MemoryError, RecursionError): ``` How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-03-10T13:37:25Z

src/ai_company/security/service.py

+        except MemoryError, RecursionError:
+            raise


Same Python 2 except syntax error (third occurrence)

Suggested change

except MemoryError, RecursionError:

raise

except (MemoryError, RecursionError):

Prompt To Fix With AI

This is a comment left during a code review. Path: src/ai_company/security/service.py Line: 338-339 Comment: **Same Python 2 `except` syntax error (third occurrence)** ```suggestion except (MemoryError, RecursionError): ``` How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-03-10T13:37:26Z

src/ai_company/security/service.py

+        except MemoryError, RecursionError:
+            raise


Same Python 2 except syntax error (fourth occurrence)

Suggested change

except MemoryError, RecursionError:

raise

except (MemoryError, RecursionError):

Prompt To Fix With AI

This is a comment left during a code review. Path: src/ai_company/security/service.py Line: 390-391 Comment: **Same Python 2 `except` syntax error (fourth occurrence)** ```suggestion except (MemoryError, RecursionError): ``` How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-03-10T13:37:27Z

src/ai_company/security/timeout/timeout_checker.py

+        except MemoryError, RecursionError:
+            raise


Same Python 2 except syntax error in timeout_checker.py

Suggested change

except MemoryError, RecursionError:

raise

except (MemoryError, RecursionError):

Prompt To Fix With AI

This is a comment left during a code review. Path: src/ai_company/security/timeout/timeout_checker.py Line: 65-66 Comment: **Same Python 2 `except` syntax error in `timeout_checker.py`** ```suggestion except (MemoryError, RecursionError): ``` How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-03-10T13:37:28Z

src/ai_company/security/timeout/policies.py

+            if elapsed_seconds < step_end:
+                if idx == 0:
+                    # First step hasn't timed out yet — WAIT.
+                    logger.debug(
+                        TIMEOUT_WAITING,
+                        approval_id=item.id,
+                        escalation_role=step.role,
+                        elapsed_seconds=elapsed_seconds,
+                    )
+                    return TimeoutAction(
+                        action=TimeoutActionType.WAIT,
+                        reason=(
+                            f"Waiting at {step.role!r} — "
+                            f"{elapsed_seconds:.0f}s of "
+                            f"{step_end:.0f}s elapsed"
+                        ),
+                    )
+                # Previous step timed out — escalate to this step's role.
+                logger.info(
+                    TIMEOUT_ESCALATED,
+                    approval_id=item.id,
+                    escalation_role=step.role,
+                    elapsed_seconds=elapsed_seconds,
+                )
+                return TimeoutAction(
+                    action=TimeoutActionType.ESCALATE,
+                    reason=(
+                        f"Escalated to {step.role!r} — {elapsed_seconds:.0f}s elapsed"
+                    ),
+                    escalate_to=step.role,


EscalationChainPolicy escalates to the wrong role — off-by-one in chain resolution

When step idx-1's timeout expires and the loop reaches idx, the code escalates to step[idx].role (the current step's role). But the semantics of the chain are that expiring step N should escalate to step N's role, not step N+1's role. The current code always skips step[0].role as an escalation target entirely.

Consider a two-step chain [team_lead(10 min), manager(20 min)]:

elapsed Expected Actual

0–10 min WAIT WAIT ✓

10–30 min ESCALATE → team_lead ESCALATE → manager ✗

30+ min ESCALATE → manager / exhausted exhausted ✗

For a single-step chain [cto(60 min)], cto is never escalated to at all — the policy immediately returns on_chain_exhausted after 60 minutes without ever issuing a ESCALATE action.

The fix is to use the previous step's role when deciding where to escalate:

for idx, step in enumerate(self._chain): step_timeout = step.timeout_minutes * _SECONDS_PER_MINUTE step_end = cumulative_seconds + step_timeout if elapsed_seconds < step_end: if idx == 0: # Waiting for the initial reviewer — no escalation yet. return TimeoutAction( action=TimeoutActionType.WAIT, reason=..., ) # Previous step (chain[idx-1]) timed out — escalate to that step's role. prev_step = self._chain[idx - 1] return TimeoutAction( action=TimeoutActionType.ESCALATE, reason=f"Escalated to {prev_step.role!r} ...", escalate_to=prev_step.role, ) cumulative_seconds += step_timeout # Chain exhausted — escalate to the last step's role before on_chain_exhausted, # or apply on_chain_exhausted directly depending on the design intent.

Prompt To Fix With AI

This is a comment left during a code review. Path: src/ai_company/security/timeout/policies.py Line: 285-314 Comment: **`EscalationChainPolicy` escalates to the wrong role — off-by-one in chain resolution** When step `idx-1`'s timeout expires and the loop reaches `idx`, the code escalates to `step[idx].role` (the *current* step's role). But the semantics of the chain are that expiring step N should escalate *to* step N's role, not step N+1's role. The current code always skips step[0].role as an escalation target entirely. Consider a two-step chain `[team_lead(10 min), manager(20 min)]`: | elapsed | Expected | Actual | |---------|----------|--------| | 0–10 min | WAIT | WAIT ✓ | | 10–30 min | ESCALATE → `team_lead` | ESCALATE → `manager` ✗ | | 30+ min | ESCALATE → `manager` / exhausted | exhausted ✗ | For a **single-step** chain `[cto(60 min)]`, `cto` is never escalated to at all — the policy immediately returns `on_chain_exhausted` after 60 minutes without ever issuing a `ESCALATE` action. The fix is to use the **previous** step's role when deciding where to escalate: ```python for idx, step in enumerate(self._chain): step_timeout = step.timeout_minutes * _SECONDS_PER_MINUTE step_end = cumulative_seconds + step_timeout if elapsed_seconds < step_end: if idx == 0: # Waiting for the initial reviewer — no escalation yet. return TimeoutAction( action=TimeoutActionType.WAIT, reason=..., ) # Previous step (chain[idx-1]) timed out — escalate to that step's role. prev_step = self._chain[idx - 1] return TimeoutAction( action=TimeoutActionType.ESCALATE, reason=f"Escalated to {prev_step.role!r} ...", escalate_to=prev_step.role, ) cumulative_seconds += step_timeout # Chain exhausted — escalate to the last step's role before on_chain_exhausted, # or apply on_chain_exhausted directly depending on the design intent. ``` How can I resolve this? If you propose a fix, please make it concise.

🤖 I have created a release *beep* *boop* --- ## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08)) * add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545)) * add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c)) * add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b)) * design unified provider interface ([#86](#86)) ([3e23d64](3e23d64)) * expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134)) * implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5)) * implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a)) * implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b)) * implement built-in file system tools ([#18](#18)) ([325ef98](325ef98)) * implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd)) * implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496)) * implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2)) * implement core entity and role system models ([#69](#69)) ([acf9801](acf9801)) * implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed)) * implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f)) * implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5)) * implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515)) * implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6)) * implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4) * implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe)) * implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca)) * implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2)) * implement model routing engine ([#99](#99)) ([d3c250b](d3c250b)) * implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3)) * implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c)) * implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85)) * implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4)) * implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e)) * implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30) * implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52)) * implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1)) * implement tool permission checking ([#16](#16)) ([833c190](833c190)) * implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba)) * implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba)) * initialize project with uv, hatchling, and src layout ([39005f9](39005f9)) * initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9)) * Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08)) * make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109) * parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee)) * testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749)) * wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617)) ### Bug Fixes * address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929)) * address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169) * enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c)) * harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e)) * harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325)) * harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb)) * incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a)) * pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108)) * strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861)) ### Performance * harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188) ### Refactoring * adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90)) * extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b)) * harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9)) * harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299)) * pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0)) * split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89)) ### Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

🤖 I have created a release *beep* *boop* --- ## [0.1.0](v0.0.0...v0.1.0) (2026-03-11) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add mandatory JWT + API key authentication ([#256](#256)) ([c279cfe](c279cfe)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable output scan response policies ([#263](#263)) ([b9907e8](b9907e8)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08)) * add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545)) * add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c)) * add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b)) * design unified provider interface ([#86](#86)) ([3e23d64](3e23d64)) * expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134)) * implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5)) * implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a)) * implement AuditRepository for security audit log persistence ([#279](#279)) ([94bc29f](94bc29f)) * implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b)) * implement built-in file system tools ([#18](#18)) ([325ef98](325ef98)) * implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd)) * implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496)) * implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2)) * implement core entity and role system models ([#69](#69)) ([acf9801](acf9801)) * implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed)) * implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f)) * implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5)) * implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515)) * implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6)) * implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4) * implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe)) * implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca)) * implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2)) * implement model routing engine ([#99](#99)) ([d3c250b](d3c250b)) * implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3)) * implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c)) * implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85)) * implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4)) * implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e)) * implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30) * implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52)) * implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1)) * implement tool permission checking ([#16](#16)) ([833c190](833c190)) * implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba)) * implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba)) * initialize project with uv, hatchling, and src layout ([39005f9](39005f9)) * initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9)) * Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08)) * make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109) * parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee)) * testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749)) * wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617)) ### Bug Fixes * address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929)) * address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169) * enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c)) * harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e)) * harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325)) * harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb)) * incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a)) * pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108)) * resolve circular imports, bump litellm, fix release tag format ([#286](#286)) ([a6659b5](a6659b5)) * strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861)) ### Performance * harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188) ### Refactoring * adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90)) * extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b)) * harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9)) * harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299)) * pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0)) * split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89)) ### Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * bump anchore/scan-action from 6.5.1 to 7.3.2 ([#271](#271)) ([80a1c15](80a1c15)) * bump docker/build-push-action from 6.19.2 to 7.0.0 ([#273](#273)) ([dd0219e](dd0219e)) * bump docker/login-action from 3.7.0 to 4.0.0 ([#272](#272)) ([33d6238](33d6238)) * bump docker/metadata-action from 5.10.0 to 6.0.0 ([#270](#270)) ([baee04e](baee04e)) * bump docker/setup-buildx-action from 3.12.0 to 4.0.0 ([#274](#274)) ([5fc06f7](5fc06f7)) * bump sigstore/cosign-installer from 3.9.1 to 4.1.0 ([#275](#275)) ([29dd16c](29dd16c)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * **main:** release ai-company 0.1.1 ([#282](#282)) ([2f4703d](2f4703d)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>

Copilot AI review requested due to automatic review settings March 10, 2026 11:59

Copilot started reviewing on behalf of Aureliolo March 10, 2026 12:00 View session

greptile-apps bot reviewed Mar 10, 2026

View reviewed changes

src/ai_company/api/controllers/autonomy.py Show resolved Hide resolved

src/ai_company/security/service.py Outdated Show resolved Hide resolved

src/ai_company/security/autonomy/change_strategy.py Show resolved Hide resolved

Copilot AI reviewed Mar 10, 2026

View reviewed changes

coderabbitai bot reviewed Mar 10, 2026

View reviewed changes

gemini-code-assist bot reviewed Mar 10, 2026

View reviewed changes

Aureliolo added 4 commits March 10, 2026 14:28

fix: correct _trim_sections unpacking after rebase conflict resolution

8ce32f9

Aureliolo force-pushed the feat/autonomy-levels-and-approval-timeout branch from 1855156 to 4deae2a Compare March 10, 2026 13:30

Aureliolo merged commit eecc25a into main Mar 10, 2026
8 of 9 checks passed

Aureliolo deleted the feat/autonomy-levels-and-approval-timeout branch March 10, 2026 13:30

greptile-apps bot reviewed Mar 10, 2026

View reviewed changes

coderabbitai bot mentioned this pull request Mar 10, 2026

feat: implement AuditRepository for security audit log persistence #279

Merged

6 tasks

Aureliolo mentioned this pull request Mar 10, 2026

chore(main): release ai-company 0.1.1 #282

Merged

Aureliolo mentioned this pull request Mar 10, 2026

chore(main): release 0.1.0 #283

Merged

coderabbitai bot mentioned this pull request Mar 14, 2026

feat: add approval workflow gates to TaskEngine #387

Merged

11 tasks

This was referenced Mar 15, 2026

chore(main): release 0.2.4 #431

Merged

chore(main): release 0.2.0 #442

Closed

chore(main): release 0.2.5 #447

Merged

chore(main): release 0.2.0 #460

Closed

chore(main): release 0.2.0 #471

Closed

coderabbitai bot mentioned this pull request Mar 17, 2026

feat: migrate config consumers to read through SettingsService #510

Merged

7 tasks

	except MemoryError, RecursionError:
	raise
	except (MemoryError, RecursionError):

elapsed	Expected	Actual
0–10 min	WAIT	WAIT ✓
10–30 min	ESCALATE → `team_lead`	ESCALATE → `manager` ✗
30+ min	ESCALATE → `manager` / exhausted	exhausted ✗

Conversation

Aureliolo commented Mar 10, 2026

Summary

Pre-PR review fixes (51 findings from 10 agents)

Test plan

Uh oh!

github-actions bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Dependency Review

Scanned Files

Uh oh!

coderabbitai bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Summary by CodeRabbit

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Uh oh!

greptile-apps bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 1/5

Important Files Changed

Sequence Diagram

Comments Outside Diff (2)

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot commented Mar 10, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

github-actions bot commented Mar 10, 2026 •

edited

Loading

coderabbitai bot commented Mar 10, 2026 •

edited

Loading

greptile-apps bot commented Mar 10, 2026 •

edited

Loading