feat: add approval workflow gates to TaskEngine (#368)
Conversation
Implement the approval gate layer that parks agent execution when human approval is required and provides infrastructure for resuming after a decision. Also adds the `request_human_approval` tool for agents to explicitly request approval.

- Add EscalationInfo/ResumePayload models and approval gate events
- Add ApprovalGate service (park/resume via ParkService)
- Add RequestHumanApprovalTool (creates ApprovalItem, signals parking)
- Add ToolInvoker escalation tracking (ESCALATE verdicts + tool metadata)
- Integrate approval gate into execute_tool_calls and both loop types
- Wire into AgentEngine, AppState, and approvals controller resume path

Closes #258
Closes #259
- Critical: empty `task_id` ValidationError, parking metadata bypass, PARKED→ERROR on park failure, wire `parked_context_repo`, TOCTOU race via `save_if_pending`, resume trigger wiring in approve/reject.
- Security: prompt injection mitigation, `max_length` on comment, `ttl_seconds` cap, `action_type` format validation, remove phantom `requested_by` field.
- Logging: wrong event constants for init/risk/park-info, consistent MemoryError logging, new INITIALIZED/RISK_CLASSIFIED/RESUME_TRIGGERED constants.
- Code quality: extract security/tool factories (962→860 lines), split `execute()` into helpers, deterministic escalation order, boolean `parking_failed`, check `delete()` result, risk classifier error handling.
- Frontend: async handler type fix, WS test payload/await fix, filter-aware WS, 404 vs transient error discrimination, total desync fix, empty catch logging.
- Tests: park failure expects ERROR, module-level markers, extracted fixtures, simplified PropertyMock, updated resume message assertions.
- Docs: persistence qualifier, CLAUDE.md events, docstring fixes.
- `_trigger_resume` no longer calls `resume_context()`, which would delete the parked record before a scheduler can consume it; it now only logs the resume trigger for scheduler observation
- Remove dead `APPROVAL_RESUMED` WsEventType (never emitted)
- Remove unused `ApprovalGate` import from the controller
Dependency Review: ✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found. Scanned files: none.
Note: Reviews paused. This branch is under active development, so CodeRabbit has automatically paused its review to avoid an influx of comments on new commits; this behavior is configurable in the CodeRabbit settings.
📝 Walkthrough (summary by CodeRabbit)
Adds an approval-gate feature: new ApprovalGate service and models, `request_human_approval` tool, ToolInvoker escalation tracking, loop/engine wiring to park/resume agent contexts (PARKED path), API/DTO/controller/store updates, a DB migration to a nullable parked `task_id`, observability events, and many tests and frontend WS/store updates.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Agent as Agent/Loop
    participant TI as ToolInvoker
    participant SO as SecOps
    participant AG as ApprovalGate
    participant PS as ParkService
    participant Repo as ParkedContextRepo
    participant AC as ApprovalController
    Agent->>TI: invoke(tool_call)
    TI->>SO: check_security(tool_call)
    alt ESCALATE with approval_id
        SO-->>TI: ESCALATE (approval_id, risk)
        TI->>TI: record pending_escalation
        TI-->>Agent: return pending_escalations
        Agent->>AG: should_park(escalations)
        AG-->>Agent: escalation_info
        Agent->>AG: park_context(escalation, context, agent_id, task_id)
        AG->>PS: serialize context + metadata
        PS-->>AG: serialized data
        AG->>Repo: save parked context
        Repo-->>AG: parked_context
        AG-->>Agent: success -> PARKED result
    else
        SO-->>TI: ALLOW / DENY
        TI->>Agent: tool result (no park)
    end
    rect rgba(200,150,255,0.5)
        Note over AC,Repo: Later — human decision triggers resume
        AC->>AC: approve/reject(approval_id)
        AC->>AG: resume_context(approval_id)
        AG->>Repo: load by approval_id
        Repo-->>AG: parked_context
        AG->>PS: deserialize -> AgentContext
        AG->>Repo: delete parked record
        AG-->>AC: (AgentContext, parked_id)
        AC->>Agent: inject resume message / continue execution
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Summary of Changes (Gemini Code Assist)

This pull request introduces a robust human approval workflow into the system, enabling agents to pause their operations and await human review for critical or sensitive tasks. This involves deep integration across the core engine, API, and tools, ensuring that agent contexts can be reliably parked, persisted, and resumed. The changes also include significant security enhancements for API interactions and crucial fixes for real-time frontend updates, making the approval process transparent and secure.
Preview Deployment URL: https://pr-368.synthorg-pr-preview.pages.dev
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/api/test_dto.py (1)
36-77: 🧹 Nitpick | 🔵 Trivial. LGTM with a suggestion.
Tests correctly updated to use the new `category:action` format (`"deploy:release"`). Consider adding test cases for invalid `action_type` formats to ensure the validator rejects malformed inputs.
💡 Suggested additional test cases for action_type validation

```python
@pytest.mark.parametrize(
    "invalid_action_type",
    [
        "deploy",     # missing colon
        ":release",   # empty category
        "deploy:",    # empty action
        "deploy: ",   # whitespace-only action
        " :release",  # whitespace-only category
        "a:b:c",      # too many parts
    ],
)
def test_action_type_invalid_format(self, invalid_action_type: str) -> None:
    with pytest.raises(ValueError, match="category:action"):
        CreateApprovalRequest(
            action_type=invalid_action_type,
            title="Test",
            description="Test desc",
            risk_level=ApprovalRiskLevel.LOW,
        )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/api/test_dto.py` around lines 36 - 77, Add new parametrized tests to cover invalid action_type formats: create a test named test_action_type_invalid_format that parametrizes invalid_action_type values (e.g. "deploy", ":release", "deploy:", "deploy: ", " :release", "a:b:c") and asserts CreateApprovalRequest raises ValueError with a message matching "category:action"; locate where other DTO tests live and follow the same pattern/fixtures as test_metadata_* functions so you assert CreateApprovalRequest(action_type=invalid_action_type, title="Test", description="Test desc", risk_level=ApprovalRiskLevel.LOW) raises the expected ValueError.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/api/controllers/approvals.py`:
- Around line 298-301: Before raising UnauthorizedError in the auth check, log
the unauthorized path with context: capture request.scope.get("user")
(auth_user) and relevant request context (e.g., request.method or request.url)
and call processLogger.warning or processLogger.error with a clear message and
the auth_user value, then raise UnauthorizedError as currently done; update the
block around auth_user = request.scope.get("user") to emit that log before
raising to satisfy the "log error paths" guideline.
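A minimal sketch of that guard, using hypothetical names (`require_auth`, the event string) and stdlib `logging` as a stand-in for the project's `get_logger`:

```python
import logging

logger = logging.getLogger(__name__)  # stand-in for ai_company.observability.get_logger


class UnauthorizedError(Exception):
    """Stand-in for the project's UnauthorizedError."""


def require_auth(scope: dict) -> dict:
    """Return the authenticated user from the request scope.

    Logs the unauthorized path with context before raising, per the
    "log error paths at WARNING/ERROR" guideline.
    """
    auth_user = scope.get("user")
    if auth_user is None:
        logger.warning(
            "approval.decision_unauthorized user=%s path=%s",
            auth_user,
            scope.get("path"),
        )
        raise UnauthorizedError("authentication required")
    return auth_user
```

In the real controller the warning would use an event constant from `ai_company.observability.events` rather than a literal string.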
- Around line 171-205: The resume flow currently only logs
APPROVAL_GATE_RESUME_TRIGGERED and never wakes parked approvals; update
_trigger_resume to call the approval gate's resume logic (use
approval_gate.resume_context or the appropriate resume method on
AppState.approval_gate) after logging, passing approval_id, approved, decided_by
and decision_reason, and ensure the call is non-blocking (e.g. schedule via
asyncio.create_task or other background executor) and properly error-handled
with logger.info/logger.error on success/failure so parked contexts are actually
resumed without consuming the approval record.
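The non-blocking scheduling can be sketched as follows; `ApprovalGateStub` and its `resume_context` signature are stand-ins for the real `AppState.approval_gate`:

```python
import asyncio


class ApprovalGateStub:
    """Stand-in for AppState.approval_gate with an async resume_context."""

    def __init__(self) -> None:
        self.resumed: list[str] = []

    async def resume_context(self, approval_id: str) -> None:
        self.resumed.append(approval_id)


gate = ApprovalGateStub()


def trigger_resume(approval_id: str) -> "asyncio.Task[None]":
    """Schedule the resume without blocking the approve/reject request path."""
    task = asyncio.create_task(gate.resume_context(approval_id))

    def _log_outcome(t: "asyncio.Task[None]") -> None:
        # Surface failures instead of letting them vanish with the task object.
        if t.exception() is not None:
            print(f"resume failed: {t.exception()}")

    task.add_done_callback(_log_outcome)
    return task


async def main() -> None:
    trigger_resume("appr-123")
    await asyncio.sleep(0)  # yield once so the scheduled task runs


asyncio.run(main())
```

The done-callback is the key detail: a bare `create_task` whose exception is never read hides resume failures.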
In `@src/ai_company/engine/approval_gate.py`:
- Around line 213-232: The code only logs when deletion fails or returns False;
add an explicit success log for observability by emitting a debug/info message
when deleted is True after calling self._parked_context_repo.delete(parked.id).
Locate the block around the deleted variable and the call to
_parked_context_repo.delete in the resume path (the try/except that currently
catches exceptions and logs APPROVAL_GATE_RESUME_DELETE_FAILED), and add a
logger.debug (or logger.info) that references approval_id and parked.id when
deleted is True so successful deletions are recorded alongside the existing
warning/exception logs.
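Sketched against a repo stub (the event strings and `RepoStub` are assumptions; the real code would use the existing `APPROVAL_GATE_*` constants and the project's logger):

```python
import logging

logger = logging.getLogger(__name__)  # stand-in for the project's get_logger


class RepoStub:
    """Stand-in for ParkedContextRepository; delete() returns a bool."""

    def __init__(self, present: set[str]) -> None:
        self._present = present

    def delete(self, parked_id: str) -> bool:
        if parked_id in self._present:
            self._present.remove(parked_id)
            return True
        return False


def delete_parked(repo: RepoStub, approval_id: str, parked_id: str) -> bool:
    """Delete a parked record, logging the success path as well as failures."""
    try:
        deleted = repo.delete(parked_id)
    except Exception:
        logger.exception("approval_gate.resume_delete_failed approval_id=%s", approval_id)
        return False
    if deleted:
        # Success is logged too, so resumes are traceable end to end.
        logger.debug(
            "approval_gate.resume_deleted approval_id=%s parked_id=%s",
            approval_id,
            parked_id,
        )
    else:
        logger.warning(
            "approval_gate.resume_delete_missing approval_id=%s parked_id=%s",
            approval_id,
            parked_id,
        )
    return deleted
```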
In `@src/ai_company/engine/loop_helpers.py`:
- Around line 373-378: The debug log in logger.debug currently uses the
misleading APPROVAL_GATE_CONTEXT_PARKED event constant when noting a missing
task_execution; replace that constant with a semantically appropriate event
(either a generic debug event constant or a new specific constant) such as
APPROVAL_GATE_CONTEXT_MISSING_TASK_EXECUTION (or GENERIC_DEBUG_EVENT) and update
the call in the same logger.debug invocation that includes
escalation.approval_id and agent_id so the log accurately reflects "missing
task_execution / task_id will be None" rather than implying a parked approval.
In `@tests/unit/engine/test_loop_helpers_approval.py`:
- Line 42: Replace the hardcoded model identifier "test-model-001" with the
repository-standard test model id (e.g., "test-small-001") wherever it appears
in the test case; update the value passed to the model parameter in the test
helper/fixture invocation that contains model="test-model-001" so the test uses
the approved naming pattern (test-small-001) consistent with other tests and
fixtures.
In `@web/src/__tests__/stores/approvals.test.ts`:
- Around line 281-290: The test spies on axios.isAxiosError (via vi.spyOn((await
import('axios')).default, 'isAxiosError')) but never restores it; add an
afterEach hook that calls mockRestore() on that spy (or reassigns the
originalIsAxiosError) so the original implementation stored in
originalIsAxiosError is reinstated after each test; locate the spy created near
axiosError/originalIsAxiosError and ensure you call spy.mockRestore() in
afterEach to prevent leakage into other tests.
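The hygiene this asks for can be sketched without a framework; in the real suite it is `vi.spyOn(axios, 'isAxiosError')` plus `spy.mockRestore()` inside `afterEach` (`axiosLike` below is a hypothetical stand-in):

```typescript
// Minimal save/replace/restore cycle for a spied method.
const axiosLike = {
  isAxiosError(err: unknown): boolean {
    return typeof err === "object" && err !== null && "isAxiosError" in err;
  },
};

const original = axiosLike.isAxiosError; // keep the real implementation

// Inside one test: install the "spy".
axiosLike.isAxiosError = () => true;

// afterEach equivalent: always reinstate the original so the override
// cannot leak into later tests.
axiosLike.isAxiosError = original;

console.log(axiosLike.isAxiosError({})); // → false (real predicate is back)
```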
In `@web/src/stores/approvals.ts`:
- Around line 79-85: When handling an approval.submitted event, don't
unconditionally increment total.value when activeFilters.value is truthy;
instead fetch the new approval via approvalsApi.getApproval(approvalId) and
check whether it matches the current activeFilters (or re-run the active query
using listApprovals(activeFilters.value)) before mutating approvals.value and
total.value. Update the handler that calls approvalsApi.getApproval(approvalId)
to either (a) apply the same filter predicate used by listApprovals to the
returned item and only prepend to approvals.value and increment total.value if
it matches, or (b) call listApprovals(activeFilters.value) and set
approvals.value and total.value from that refreshed result so the filtered total
stays correct.
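Option (a) can be sketched with a shared predicate; the filter fields (`status`, `risk_level`) are assumptions standing in for whatever `listApprovals` actually filters on:

```typescript
interface Approval {
  id: string;
  status: string;
  risk_level: string;
}

interface Filters {
  status?: string;
  risk_level?: string;
}

// Mirror the predicate the active list query uses server-side.
function matchesFilters(item: Approval, filters: Filters): boolean {
  if (filters.status && item.status !== filters.status) return false;
  if (filters.risk_level && item.risk_level !== filters.risk_level) return false;
  return true;
}

// On approval.submitted: only prepend and bump the total when the fetched
// item actually belongs to the filtered view.
function onSubmitted(
  item: Approval,
  filters: Filters,
  approvals: Approval[],
  total: { value: number },
): void {
  if (!matchesFilters(item, filters)) return;
  approvals.unshift(item);
  total.value += 1;
}
```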
- Around line 96-112: The status-change branch must reconcile the active query
instead of assuming the changed item is in approvals.value: after handling
'approval.approved'|'approval.rejected'|'approval.expired' (where you currently
call approvalsApi.getApproval and update approvals.value/total.value), re-run
the current list query to refresh approvals.value and total.value (e.g., call
the module's existing loader like fetchApprovals() or invoke approvalsApi.list
with the current filters/page) so filtered views reflect removals, additions,
and updated rows; keep the 404/410 fallback only as a secondary removal step if
the item truly no longer exists.
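The reconciliation step can be sketched as re-running the active query; `fetchPage` is a hypothetical loader standing in for `listApprovals` with the current filters:

```typescript
interface Page {
  items: string[];
  total: number;
}

// On approval.approved/rejected/expired: refresh the whole filtered page
// instead of patching a possibly-absent row in place.
async function onStatusChanged(
  fetchPage: () => Promise<Page>,
  state: { items: string[]; total: number },
): Promise<void> {
  const page = await fetchPage();
  state.items = page.items; // filtered view reflects removals and additions
  state.total = page.total; // total stays in sync with the server
}
```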
---
Outside diff comments:
In `@tests/unit/api/test_dto.py`:
- Around line 36-77: Add new parametrized tests to cover invalid action_type
formats: create a test named test_action_type_invalid_format that parametrizes
invalid_action_type values (e.g. "deploy", ":release", "deploy:", "deploy: ", "
:release", "a:b:c") and asserts CreateApprovalRequest raises ValueError with a
message matching "category:action"; locate where other DTO tests live and follow
the same pattern/fixtures as test_metadata_* functions so you assert
CreateApprovalRequest(action_type=invalid_action_type, title="Test",
description="Test desc", risk_level=ApprovalRiskLevel.LOW) raises the expected
ValueError.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 665c63f1-0f43-4e10-8243-1b719f49b678
📒 Files selected for processing (32)
- CLAUDE.md
- README.md
- docs/design/engine.md
- src/ai_company/api/approval_store.py
- src/ai_company/api/controllers/approvals.py
- src/ai_company/api/dto.py
- src/ai_company/api/state.py
- src/ai_company/engine/__init__.py
- src/ai_company/engine/_security_factory.py
- src/ai_company/engine/agent_engine.py
- src/ai_company/engine/approval_gate.py
- src/ai_company/engine/approval_gate_models.py
- src/ai_company/engine/loop_helpers.py
- src/ai_company/engine/plan_execute_loop.py
- src/ai_company/engine/react_loop.py
- src/ai_company/observability/events/approval_gate.py
- src/ai_company/security/timeout/park_service.py
- src/ai_company/security/timeout/parked_context.py
- src/ai_company/tools/__init__.py
- src/ai_company/tools/approval_tool.py
- src/ai_company/tools/invoker.py
- src/ai_company/tools/registry.py
- tests/unit/api/controllers/test_approvals.py
- tests/unit/api/test_dto.py
- tests/unit/engine/test_approval_gate.py
- tests/unit/engine/test_approval_gate_models.py
- tests/unit/engine/test_loop_helpers_approval.py
- tests/unit/observability/test_events.py
- tests/unit/tools/test_approval_tool.py
- tests/unit/tools/test_invoker_escalation.py
- web/src/__tests__/stores/approvals.test.ts
- web/src/stores/approvals.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Agent
- GitHub Check: Build Backend
- GitHub Check: Build Web
- GitHub Check: Greptile Review
- GitHub Check: Test (Python 3.14)
- GitHub Check: Build Backend
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (9)
`**/*.py` (📄 CodeRabbit inference engine, CLAUDE.md)

- Python version 3.14+ with PEP 649 native lazy annotations required.
- Do NOT use `from __future__ import annotations`; Python 3.14 has PEP 649 native lazy annotations.
- Use PEP 758 except syntax: `except A, B:` (no parentheses); ruff enforces this on Python 3.14.
- Line length must be 88 characters, enforced by ruff.
- NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`. Tests must use `test-provider`, `test-small-001`, etc.

Files: all `.py` files selected in this PR.

`tests/**/*.py` (📄 CodeRabbit inference engine, CLAUDE.md)

- Tests must use markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`.
- Test coverage minimum: 80% (enforced in CI).
- Async tests: use `asyncio_mode = "auto"`; no manual `@pytest.mark.asyncio` needed.
- Test timeout: 30 seconds per test.
- Prefer `@pytest.mark.parametrize` for testing similar cases.

Files: the `tests/unit/**` test files selected in this PR.

`docs/design/**/*.md` (📄 CodeRabbit inference engine, CLAUDE.md)

- When approved deviations occur, update the relevant `docs/design/` page to reflect the new reality.

Files: docs/design/engine.md

`docs/**/*.md` (📄 CodeRabbit inference engine, CLAUDE.md)

- Documentation source lives in `docs/`, built with Zensical. Design spec in `docs/design/` (7 pages). Architecture in `docs/architecture/`. Roadmap in `docs/roadmap/`. Security in `docs/security.md`.

Files: docs/design/engine.md

`web/src/**/*.{vue,ts,tsx}` (📄 CodeRabbit inference engine, CLAUDE.md)

- Web dashboard: Vue 3 + PrimeVue + Tailwind CSS, organized by feature in src/components/, src/stores/, src/views/. Enforce with ESLint and vue-tsc type-checking.

Files: web/src/__tests__/stores/approvals.test.ts, web/src/stores/approvals.ts

`web/src/__tests__/**/*.ts` (📄 CodeRabbit inference engine, CLAUDE.md)

- Web dashboard tests: Vitest unit tests organized by feature in `__tests__/`.

Files: web/src/__tests__/stores/approvals.test.ts

`src/**/*.py` (📄 CodeRabbit inference engine, CLAUDE.md)

- All public functions must have type hints; enforce via mypy strict mode.
- Google-style docstrings required on public classes and functions, enforced by ruff D rules.
- Create new objects instead of mutating existing ones. For non-Pydantic internal collections, use `copy.deepcopy()` at construction plus `MappingProxyType` wrapping for read-only enforcement.
- For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
- Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
- Use Pydantic v2 with `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. Use `@computed_field` for derived values instead of storing redundant fields (e.g., `TokenUsage.total_tokens`).
- Use `NotBlankStr` from `core.types` for all identifier/name fields in Pydantic models, including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants, instead of manual whitespace validators.
- Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`.
- Functions must be less than 50 lines; files must be less than 800 lines.
- Handle errors explicitly, never silently swallow exceptions.
- Validate at system boundaries: user input, external APIs, config files.
- NEVER use `import logging`, `logging.getLogger()`, or `print()` in application code.
- Always use `logger` as the variable name for loggers (not `_logger`, not `log`).
- Event names must always use constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`). Import dire...

Files: all `src/**/*.py` files selected in this PR.

`src/ai_company/**/[!_]*.py` (📄 CodeRabbit inference engine, CLAUDE.md)

- Every module with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`.

Files: the non-underscore `src/ai_company/**` modules selected in this PR.

`web/src/stores/**/*.ts` (📄 CodeRabbit inference engine, CLAUDE.md)

- Frontend state management: Pinia stores organized by feature in src/stores/.

Files: web/src/stores/approvals.ts
🧠 Learnings (14)

All 14 learnings come from CLAUDE.md (repo: Aureliolo/synthorg, learned 2026-03-13):

- Web dashboard tests: Vitest unit tests organized by feature in `__tests__/` (applied to: web/src/__tests__/stores/approvals.test.ts)
- Event names must always use constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`); import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT` (applied to: src/ai_company/observability/events/approval_gate.py, src/ai_company/engine/agent_engine.py, CLAUDE.md, src/ai_company/engine/loop_helpers.py)
- Every module with business logic MUST have `from ai_company.observability import get_logger` then `logger = get_logger(__name__)` (applied to: src/ai_company/api/state.py, CLAUDE.md)
- Structured logging: always use `logger.info(EVENT, key=value)`, never `logger.info("msg %s", val)` (applied to: CLAUDE.md)
- All state transitions must log at INFO level (applied to: CLAUDE.md)
- NEVER use `import logging`, `logging.getLogger()`, or `print()` in application code (applied to: CLAUDE.md)
- All error paths must log at WARNING or ERROR with context before raising (applied to: CLAUDE.md)
- Always use `logger` as the variable name for loggers (not `_logger`, not `log`) (applied to: CLAUDE.md)
- DEBUG logging for object creation, internal flow, and entry/exit of key functions (applied to: CLAUDE.md)
- Pure data models, enums, and re-exports do NOT need logging (applied to: CLAUDE.md)
- Use Pydantic v2 with `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`; use `computed_field` for derived values instead of storing redundant fields (e.g., `TokenUsage.total_tokens`) (applied to: src/ai_company/api/dto.py)
- Tests must use markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow` (applied to: tests/unit/observability/test_events.py)
- Test timeout: 30 seconds per test (applied to: tests/unit/observability/test_events.py)
- Frontend state management: Pinia stores organized by feature in src/stores/ (applied to: web/src/stores/approvals.ts)
🧬 Code graph analysis (19)

tests/unit/engine/test_loop_helpers_approval.py (5)
- src/ai_company/engine/approval_gate.py (2): ApprovalGate (40-274), should_park (71-90)
- src/ai_company/engine/approval_gate_models.py (1): EscalationInfo (14-33)
- src/ai_company/engine/loop_helpers.py (1): execute_tool_calls (241-343)
- src/ai_company/engine/loop_protocol.py (2): ExecutionResult (79-140), TerminationReason (28-36)
- src/ai_company/providers/models.py (3): CompletionResponse (257-306), ToolCall (96-119), ToolResult (122-135)

web/src/__tests__/stores/approvals.test.ts (2)
- web/src/stores/approvals.ts (1): useApprovalStore (8-136)
- web/src/api/types.ts (1): WsEvent (519-524)

tests/unit/api/test_dto.py (2)
- src/ai_company/core/enums.py (1): ApprovalRiskLevel (443-449)
- src/ai_company/api/dto.py (1): CreateApprovalRequest (196-249)

src/ai_company/engine/approval_gate.py (5)
- src/ai_company/persistence/repositories.py (1): ParkedContextRepository (199-267)
- src/ai_company/security/timeout/park_service.py (3): ParkService (29-144), park (37-106), resume (108-144)
- src/ai_company/engine/approval_gate_models.py (1): EscalationInfo (14-33)
- src/ai_company/engine/context.py (1): AgentContext (87-307)
- src/ai_company/memory/errors.py (1): MemoryError (13-14)

tests/unit/engine/test_approval_gate_models.py (1)
- src/ai_company/engine/approval_gate_models.py (2): EscalationInfo (14-33), ResumePayload (36-51)

src/ai_company/engine/_security_factory.py (7)
- src/ai_company/security/rules/engine.py (1): RuleEngine (32-174)
- src/ai_company/security/service.py (1): SecOpsService (74-456)
- src/ai_company/core/agent.py (1): AgentIdentity (266-342)
- src/ai_company/security/config.py (1): SecurityConfig (93-140)
- src/ai_company/security/protocol.py (1): SecurityInterceptionStrategy (18-54)
- src/ai_company/tools/registry.py (2): ToolRegistry (30-126), all_tools (102-104)
- src/ai_company/tools/approval_tool.py (1): RequestHumanApprovalTool (32-252)

src/ai_company/security/timeout/parked_context.py (2)
- src/ai_company/engine/parallel_models.py (1): task_id (87-89)
- src/ai_company/tools/base.py (1): description (138-140)

tests/unit/engine/test_approval_gate.py (5)
- src/ai_company/engine/approval_gate.py (5): ApprovalGate (40-274), should_park (71-90), park_context (92-160), resume_context (162-239), build_resume_message (242-274)
- src/ai_company/engine/approval_gate_models.py (1): EscalationInfo (14-33)
- src/ai_company/persistence/repositories.py (1): ParkedContextRepository (199-267)
- src/ai_company/security/timeout/park_service.py (3): ParkService (29-144), park (37-106), resume (108-144)
- src/ai_company/api/approval_store.py (1): save (106-123)

src/ai_company/engine/agent_engine.py (6)
- src/ai_company/engine/_security_factory.py (2): make_security_interceptor (39-117), registry_with_approval_tool (120-148)
- src/ai_company/engine/approval_gate.py (1): ApprovalGate (40-274)
- src/ai_company/persistence/repositories.py (1): ParkedContextRepository (199-267)
- src/ai_company/engine/loop_protocol.py (1): ExecutionLoop (151-189)
- src/ai_company/security/timeout/park_service.py (1): ParkService (29-144)
- src/ai_company/engine/react_loop.py (1): ReactLoop (55-269)

src/ai_company/api/state.py (1)
- src/ai_company/engine/approval_gate.py (1): ApprovalGate (40-274)

tests/unit/tools/test_invoker_escalation.py (6)
- src/ai_company/providers/models.py (1): ToolCall (96-119)
- src/ai_company/security/models.py (2): SecurityVerdict (35-67), SecurityVerdictType (23-32)
- src/ai_company/tools/base.py (1): ToolExecutionResult (25-54)
- src/ai_company/tools/invoker.py (4): registry (123-125), pending_escalations (128-136), invoke (343-365), invoke_all (711-781)
- src/ai_company/tools/registry.py (1): ToolRegistry (30-126)
- src/ai_company/engine/parallel_models.py (2): agent_id (79-81), task_id (87-89)

src/ai_company/security/timeout/park_service.py (1)
- src/ai_company/engine/parallel_models.py (1): task_id (87-89)

src/ai_company/tools/__init__.py (1)
- src/ai_company/tools/approval_tool.py (1): RequestHumanApprovalTool (32-252)

src/ai_company/engine/approval_gate_models.py (1)
- src/ai_company/tools/base.py (1): action_type (133-135)

src/ai_company/api/controllers/approvals.py (4)
- src/ai_company/api/state.py (1): approval_gate (139-141)
- src/ai_company/api/approval_store.py (2): get (61-73), save_if_pending (125-142)
- src/ai_company/api/auth/models.py (1): AuthenticatedUser (68-87)
- src/ai_company/api/errors.py (2): UnauthorizedError (59-65), ConflictError (41-47)

src/ai_company/engine/react_loop.py (2)
- src/ai_company/api/state.py (1): approval_gate (139-141)
- src/ai_company/engine/approval_gate.py (1): ApprovalGate (40-274)

web/src/stores/approvals.ts (1)
- web/src/api/types.ts (1): WsEvent (519-524)

src/ai_company/api/approval_store.py (4)
- src/ai_company/core/approval.py (1): ApprovalItem (24-96)
- src/ai_company/core/enums.py (1): ApprovalStatus (434-440)
- src/ai_company/api/app.py (1): _on_expire (74-99)
- src/ai_company/memory/errors.py (1): MemoryError (13-14)

src/ai_company/engine/__init__.py (3)
- src/ai_company/api/state.py (1): approval_gate (139-141)
- src/ai_company/engine/approval_gate.py (1): ApprovalGate (40-274)
- src/ai_company/engine/approval_gate_models.py (2): EscalationInfo (14-33), ResumePayload (36-51)
🪛 LanguageTool
CLAUDE.md
[style] ~154-~154: A comma is missing here.
Context: ...nder ai_company.observability.events (e.g. PROVIDER_CALL_START from `events.prov...
(EG_NO_COMMA)
🔇 Additional comments (46)
web/src/__tests__/stores/approvals.test.ts (3)
6-7: Good async test helper for WS microtask completion. This helper makes the sync `handleWsEvent` + async IIFE pattern testable and keeps assertions deterministic.

163-279: WS event coverage is strong and aligned with `approval_id` payload semantics. Nice expansion across submitted/approved/rejected/expired plus filter-active behavior and duplicate suppression.

341-366: Transient error path assertions are solid. Verifying unchanged state plus the `console.warn` payload here is valuable for regression safety.

web/src/stores/approvals.ts (1)
67-71: Good guard for malformed WS payloads. Early-returning unless `approval_id` is a non-empty string keeps bad events from mutating store state.

src/ai_company/tools/registry.py (1)

102-104: LGTM! The `all_tools()` method follows the established pattern from `list_tools()` and `to_definitions()`, providing deterministic name-sorted iteration over tool instances. Clean implementation with proper type hints.

docs/design/engine.md (1)

450-468: LGTM! The documentation accurately describes the new approval gate integration, including escalation detection via `ToolInvoker.pending_escalations`, context serialization through `ParkService`, and the `PARKED` termination-reason semantics. Clear and consistent with the implementation.

src/ai_company/api/approval_store.py (2)

125-142: LGTM! The `save_if_pending` method correctly implements atomic conditional-update semantics: it applies the lazy expiration check before the status comparison, preventing TOCTOU races where an item expires or gets decided between check and save. Returns `None` on both "not found" and "not pending" cases, which is appropriate for the caller to handle.
170-179: LGTM! Good defensive handling of the `on_expire` callback: re-raises critical errors (`MemoryError`, `RecursionError`) while logging and swallowing other exceptions to prevent callback failures from breaking the expiration flow. Uses `logger.exception` correctly for full traceback capture.

CLAUDE.md (2)

101-115: LGTM! Package structure documentation accurately reflects the new approval gate integration points: API approval gate integration, engine approval gate components (escalation detection, context parking/resume, models), and the `request_human_approval` tool.

154-154: LGTM! Event constant documentation properly extended with the new approval gate events (`APPROVAL_GATE_ESCALATION_DETECTED`, `APPROVAL_GATE_INITIALIZED`, `APPROVAL_GATE_RISK_CLASSIFIED`, `APPROVAL_GATE_RESUME_TRIGGERED`), following the established pattern for domain-specific event modules.

src/ai_company/api/dto.py (4)
10-17: LGTM! Clean import organization for Pydantic v2 validation decorators. The addition of `field_validator` and `model_validator` enables the new action_type format and metadata bounds validation.

221-232: LGTM! The `_validate_action_type_format` validator correctly enforces the `category:action` format by checking for exactly two non-empty parts after splitting on `:`. The use of `strip()` handles whitespace-padded segments appropriately.

217-217: LGTM! Reasonable TTL bounds: a minimum of 60 seconds prevents excessively short-lived approvals, and a maximum of 604800 seconds (7 days) prevents indefinite pending items. Aligns with the docstring update.

261-261: LGTM! Adding `max_length=4096` to the optional comment field provides consistent bounds with other text fields (`description`, `reason`) and prevents unbounded input.

src/ai_company/tools/__init__.py (2)
3-3: LGTM! Clean import of the new `RequestHumanApprovalTool` from the approval_tool module.

68-68: LGTM! `RequestHumanApprovalTool` correctly added to `__all__` in alphabetical order, maintaining the established convention for public API exports.

tests/unit/api/controllers/test_approvals.py (1)

19-29: LGTM! Test payload correctly updated to use the new `category:action` format (`"code:merge"`). The removal of `requested_by` from the payload aligns with the DTO change where this field is now derived from the authenticated user rather than the request body.

README.md (1)

132-132: LGTM! The Status section accurately reflects the addition of approval workflow gates to the implemented features.
src/ai_company/api/state.py (1)

138-141: LGTM! The `approval_gate` property correctly returns the optional service without raising 503, which is appropriate since approval gates are an optional feature. The pattern is consistent with the nullable return type.

src/ai_company/security/timeout/parked_context.py (1)

41-43: LGTM! The `task_id` field change to optional with a `None` default properly supports taskless agents. The description update clarifies the use case.

src/ai_company/engine/__init__.py (1)

9-10: LGTM! The new imports and `__all__` entries properly expose the approval gate components as part of the public engine API.

src/ai_company/engine/loop_helpers.py (2)
330-341: LGTM! The escalation check integrates cleanly after tool execution. The flow correctly:
- Checks if approval_gate is provided
- Queries pending escalations via `should_park`
- Delegates to `_park_for_approval` when an escalation is detected

346-419: LGTM! The `_park_for_approval` helper is well-structured:
- Properly extracts agent_id and task_id from context
- Handles parking failures by returning ERROR with descriptive metadata
- Returns PARKED result with approval_id on success
- Re-raises MemoryError and RecursionError as required
tests/unit/engine/test_approval_gate_models.py (3)

1-9: LGTM! Test file follows guidelines with proper markers (`pytest.mark.unit`) and timeout configuration.

12-88: LGTM! Comprehensive test coverage for `EscalationInfo`:
- Valid construction with all fields
- Immutability enforcement via frozen model
- Parametrized tests for blank/empty string rejection
- All risk levels validated

90-133: LGTM! Good coverage for `ResumePayload` including approved/rejected paths and immutability checks.

tests/unit/observability/test_events.py (2)
135-135: LGTM! The pytestmark format change to a list correctly combines the unit marker with the timeout.

178-178: LGTM! Adding `"approval_gate"` to the expected domain modules ensures the new observability events module is discovered and validated.

src/ai_company/engine/approval_gate_models.py (1)

1-51: LGTM! Well-structured frozen Pydantic models following project conventions:
- Proper use of `NotBlankStr` for all identifier fields
- Frozen config for immutability
- Clear Google-style docstrings
- No logging required for pure data models per guidelines
tests/unit/tools/test_approval_tool.py (1)

59-256: Strong coverage for approval-tool execution and failure paths. This test suite exercises both happy paths and boundary/error behavior well (parking metadata, taskless mode, validation, and store failures).

src/ai_company/security/timeout/park_service.py (1)

43-53: `task_id` optionality is integrated cleanly. The signature and docstring updates are consistent and preserve existing park/resume behavior for task-bound contexts.

tests/unit/engine/test_approval_gate.py (1)

188-291: Resume-path testing is thorough and well targeted. Good coverage of success, unknown approval, deserialization failure, and delete-failure behavior.
tests/unit/tools/test_invoker_escalation.py (1)

114-280: Escalation-tracking tests look solid. The cases for verdict-driven escalation, clearing between calls, and parking metadata are comprehensive and high-value.

src/ai_company/engine/react_loop.py (1)

65-70: Approval-gate wiring in the ReAct loop is clean. The optional dependency is threaded correctly and keeps default behavior intact when no gate is configured.
Also applies to: 217-224

src/ai_company/engine/plan_execute_loop.py (1)

85-93: PlanExecute approval-gate plumbing is correctly applied. The instance-level gate injection and propagation into `execute_tool_calls()` are consistent and low-risk.
Also applies to: 733-758
src/ai_company/tools/approval_tool.py (1)

1-252: LGTM! The implementation follows coding guidelines correctly:
- Uses PEP 758 except syntax (`except MemoryError, RecursionError:`)
- Proper structured logging with event constants from `ai_company.observability.events.approval_gate`
- Google-style docstrings on public class and methods
- Type hints on all public functions
- Defensive error handling with fallback to `ApprovalRiskLevel.HIGH`

src/ai_company/engine/_security_factory.py (1)

1-148: LGTM! The factory functions are well-structured:
- `make_security_interceptor` properly validates configuration consistency and logs warnings/errors before raising
- `registry_with_approval_tool` correctly creates a new registry rather than mutating the existing one
- Proper use of structured logging with event constants
src/ai_company/observability/events/approval_gate.py (1)

1-16: LGTM! Event constants module correctly follows the established patterns:
- Uses `typing.Final[str]` for immutability
- Dot-notation naming convention (`approval_gate.<category>.<action>`)
- Comprehensive coverage of approval gate lifecycle events

Based on learnings: "Event names must always use constants from domain-specific modules under `ai_company.observability.events`"; this module provides those constants for the approval gate domain.

src/ai_company/engine/approval_gate.py (1)

241-274: Good security design in `build_resume_message`. The prompt injection mitigation is well-implemented:
- Uses `repr()` (the `!r` format specifier) to escape user-supplied values
- Explicitly labels `decision_reason` as untrusted data with clear instructions not to follow it as instructions
- Structural separation between the system decision signal and user content
src/ai_company/engine/agent_engine.py (3)

139-142: Initialization order is correct. The assignment order ensures dependencies are satisfied:
- `_approval_store` is stored first
- `_parked_context_repo` is stored
- `_approval_gate` is created (depends on both above)
- `_loop` is created (depends on `_approval_gate`)

597-617: LGTM! The `_make_approval_gate` and `_make_default_loop` methods are well-structured:
- Returns `None` when the approval store is unavailable, allowing graceful degradation
- Lazy import of `ParkService` avoids circular imports
- `ReactLoop` receives the approval gate for escalation handling

631-655: LGTM! The `_make_tool_invoker` correctly integrates approval tooling:
- Uses `registry_with_approval_tool` to augment the registry when the approval store is available
- Passes identity and task_id for proper context in the approval tool
- Creates the security interceptor with effective_autonomy

src/ai_company/tools/invoker.py (4)

120-136: LGTM! The escalation state management follows the immutability guidelines correctly:
- Internal `_pending_escalations` is a mutable list for accumulation
- Public `pending_escalations` property returns an immutable tuple
- Cleared at the start of each `invoke()` and `invoke_all()` call
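The accumulate-then-snapshot pattern can be illustrated with a toy invoker; the names mirror the review text but are assumptions, not the actual `ToolInvoker` API:

```python
# Sketch: accumulate escalations in a private mutable list, expose an
# immutable tuple snapshot, and reset at the start of each invocation.
class Invoker:
    def __init__(self) -> None:
        self._pending_escalations: list[str] = []

    @property
    def pending_escalations(self) -> tuple[str, ...]:
        # Callers get an immutable snapshot; the internal list stays private,
        # so nothing outside can inject or drop escalations.
        return tuple(self._pending_escalations)

    def invoke(self, tool: str, *, escalates: bool) -> None:
        self._pending_escalations.clear()  # stale state never leaks across calls
        if escalates:
            self._pending_escalations.append(tool)
```

Returning a fresh tuple on every property access is cheap for the small lists involved and makes it impossible for a caller to mutate the gate's view of pending escalations.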
601-652: Good defensive design: re-raising as `ToolExecutionError` on tracking failure. The comment at lines 613-616 correctly explains the rationale: if escalation tracking fails, the caller must treat it as a tool execution failure to prevent silently bypassing the approval gate. This is the right security posture.

773-779: Deterministic escalation ordering is a good design choice. Sorting escalations by tool-call index ensures consistent behavior regardless of concurrent execution order, which aids debugging and testing.
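A minimal illustration of index-based ordering; the `call_index` field name is assumed for the sketch:

```python
# Sketch: regardless of which concurrent tool call finishes first, sorting
# by the original tool-call index yields a stable, reproducible order.
def ordered(escalations: list[dict]) -> list[dict]:
    return sorted(escalations, key=lambda e: e["call_index"])
```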
633-635: The enum construction is correct; no issue exists. `ApprovalRiskLevel` extends `StrEnum` with values `LOW = "low"`, `MEDIUM = "medium"`, `HIGH = "high"`, and `CRITICAL = "critical"`. `StrEnum` is designed for string-based construction, so `ApprovalRiskLevel("high")` will successfully resolve to the enum member. The default fallback value `"high"` also matches a valid enum value.
> Likely an incorrect or invalid review comment.
async def _trigger_resume(
    app_state: AppState,
    approval_id: str,
    *,
    approved: bool,
    decided_by: str,
    decision_reason: str | None = None,
) -> None:
    """Signal that a parked agent can be resumed after a decision.

    Does NOT consume the parked record — that is the responsibility
    of the future scheduling component that will actually re-enqueue
    the agent. This function only logs the resume trigger so that
    a scheduler observing log events or polling the approval store
    can detect the decision and call ``ApprovalGate.resume_context()``.

    Args:
        app_state: Application state containing the approval gate.
        approval_id: The approval item identifier.
        approved: Whether the action was approved.
        decided_by: Who made the decision.
        decision_reason: Optional reason for the decision.
    """
    approval_gate = app_state.approval_gate
    if approval_gate is None:
        return

    logger.info(
        APPROVAL_GATE_RESUME_TRIGGERED,
        approval_id=approval_id,
        approved=approved,
        decided_by=decided_by,
        has_reason=decision_reason is not None,
    )
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify whether any runtime path consumes APPROVAL_GATE_RESUME_TRIGGERED
# or otherwise resumes parked contexts on approval decisions.
rg -n --type=py 'APPROVAL_GATE_RESUME_TRIGGERED'
rg -n --type=py '\bresume_context\s*\('
rg -n --type=py '\b_trigger_resume\s*\('

Repository: Aureliolo/synthorg
Length of output: 1609

🏁 Script executed:
cat -n src/ai_company/api/controllers/approvals.py | head -80

Repository: Aureliolo/synthorg
Length of output: 3083

🏁 Script executed:
cat -n src/ai_company/api/controllers/approvals.py | sed -n '400,500p'

Repository: Aureliolo/synthorg
Length of output: 3707

🏁 Script executed:
rg -n --type=py 'event.*listener|subscribe|consume.*APPROVAL_GATE_RESUME' -i

Repository: Aureliolo/synthorg
Length of output: 31713

🏁 Script executed:
fd -type f -name '*scheduler*' -o -name '*background*' -o -name '*worker*' -o -name '*poll*' | grep -E '\.py$'

Repository: Aureliolo/synthorg
Length of output: 232

🏁 Script executed:
rg -n --type=py 'scheduler|background.*job|polling.*approval|event.*consumer' -i

Repository: Aureliolo/synthorg
Length of output: 176

🏁 Script executed:
cat -n src/ai_company/api/controllers/approvals.py | sed -n '171,210p'

Repository: Aureliolo/synthorg
Length of output: 1637
Resume mechanism incomplete: parked approvals are never resumed.
_trigger_resume() only logs APPROVAL_GATE_RESUME_TRIGGERED; no production code consumes this event or invokes ApprovalGate.resume_context(). The docstring acknowledges this as a future responsibility ("a scheduler observing log events or polling the approval store can detect the decision"), but that scheduler does not exist. Without a consumer, approvals remain parked indefinitely after decision.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/api/controllers/approvals.py` around lines 171 - 205, The
resume flow currently only logs APPROVAL_GATE_RESUME_TRIGGERED and never wakes
parked approvals; update _trigger_resume to call the approval gate's resume
logic (use approval_gate.resume_context or the appropriate resume method on
AppState.approval_gate) after logging, passing approval_id, approved, decided_by
and decision_reason, and ensure the call is non-blocking (e.g. schedule via
asyncio.create_task or other background executor) and properly error-handled
with logger.info/logger.error on success/failure so parked contexts are actually
resumed without consuming the approval record.
Greptile Summary

This PR implements the approval workflow gate feature, wiring SecOps ESCALATE verdicts and a new

Key changes:
Two issues to address:
Confidence Score: 4/5
Important Files Changed
Pull request overview
Adds an approval-gated execution path to the agent engine so SecOps ESCALATE verdicts and agent-initiated request_human_approval tool calls can park execution, persist context, and support resumption workflows.
Changes:
- Introduces `ApprovalGate` + models and wires parking checks into `ReactLoop`/`PlanExecuteLoop` via `loop_helpers.execute_tool_calls()`.
- Adds `RequestHumanApprovalTool` and extends `ToolInvoker` to track pending escalations (SecOps verdicts + tool metadata).
- Hardens the approvals API (auth-bound `requested_by`, conflict-safe saves, resume-trigger logging) and fixes frontend WS handling to use `approval_id` and fetch full items.
Reviewed changes
Copilot reviewed 32 out of 32 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| web/src/stores/approvals.ts | Update WS approval event handling to use approval_id and fetch full item via API. |
| web/src/__tests__/stores/approvals.test.ts | Adds async WS handler tests and error-path coverage for approval WS updates. |
| tests/unit/tools/test_invoker_escalation.py | New unit tests for ToolInvoker escalation tracking and clearing behavior. |
| tests/unit/tools/test_approval_tool.py | New unit tests for RequestHumanApprovalTool creation, validation, and error handling. |
| tests/unit/observability/test_events.py | Updates event discovery expectations and standardizes pytest marks. |
| tests/unit/engine/test_loop_helpers_approval.py | New tests for approval gate integration in execute_tool_calls(). |
| tests/unit/engine/test_approval_gate_models.py | New tests for frozen Pydantic models EscalationInfo / ResumePayload. |
| tests/unit/engine/test_approval_gate.py | New tests for ApprovalGate park/resume behavior and resume message formatting. |
| tests/unit/api/test_dto.py | Updates DTO tests for new action_type format and removed requested_by. |
| tests/unit/api/controllers/test_approvals.py | Updates controller test payload defaults for category:action action_type. |
| src/ai_company/tools/registry.py | Adds all_tools() to support composing registries with injected tools. |
| src/ai_company/tools/invoker.py | Tracks pending_escalations from SecOps ESCALATE and tool parking metadata; deterministic ordering in invoke_all. |
| src/ai_company/tools/approval_tool.py | Implements request_human_approval tool that creates an ApprovalItem and signals parking via metadata. |
| src/ai_company/tools/__init__.py | Exposes RequestHumanApprovalTool from tools package. |
| src/ai_company/security/timeout/parked_context.py | Makes task_id optional on parked contexts (taskless agents). |
| src/ai_company/security/timeout/park_service.py | Allows task_id=None when parking contexts. |
| src/ai_company/observability/events/approval_gate.py | Adds structured event constants for approval gate lifecycle. |
| src/ai_company/engine/react_loop.py | Injects optional ApprovalGate into loop and passes through to tool execution helper. |
| src/ai_company/engine/plan_execute_loop.py | Injects optional ApprovalGate and routes through to tool execution helper. |
| src/ai_company/engine/loop_helpers.py | Adds approval-gate parking path after tool execution and returns TerminationReason.PARKED. |
| src/ai_company/engine/approval_gate_models.py | Adds frozen models for escalation + resume payload. |
| src/ai_company/engine/approval_gate.py | Implements park/persist and resume/load/delete orchestration plus resume message builder. |
| src/ai_company/engine/agent_engine.py | Wires approval gate + approval tool registry injection; factors security setup into _security_factory. |
| src/ai_company/engine/_security_factory.py | New factory helpers for security interceptor and approval-tool registry composition. |
| src/ai_company/engine/__init__.py | Re-exports approval gate types as part of engine public API. |
| src/ai_company/api/state.py | Stores optional ApprovalGate in AppState for controller access/logging. |
| src/ai_company/api/dto.py | Hardens approval DTOs (action_type format validator; TTL max; approval comment max length). |
| src/ai_company/api/controllers/approvals.py | Binds requested_by to authenticated user, adds conflict-safe decision saving and resume-trigger logging, broadens WS publish exception handling. |
| src/ai_company/api/approval_store.py | Adds save_if_pending() and guards on_expire callback failures. |
| docs/design/engine.md | Documents approval gate parking path in the engine run flow. |
| README.md | Updates status to include approval workflow gates as implemented. |
| CLAUDE.md | Updates repo structure notes and observability event guidance to include approval-gate domain. |
const originalIsAxiosError = (await import('axios')).default.isAxiosError
vi.spyOn((await import('axios')).default, 'isAxiosError').mockImplementation((err) => {
  if (err === axiosError) return true
  return originalIsAxiosError(err)
})

mockGetApproval.mockRejectedValue(axiosError)

const store = useApprovalStore()
store.approvals = [mockApproval]
store.total = 5

const event: WsEvent = {
  event_type: 'approval.approved',
  channel: 'approvals',
  timestamp: '2026-03-12T10:01:00Z',
  payload: { approval_id: 'approval-1', status: 'approved', action_type: 'deploy:production', risk_level: 'high' },
}
store.handleWsEvent(event)
await flushPromises()

// Item was found and removed, so total decremented
expect(store.approvals).toHaveLength(0)
expect(store.total).toBe(4)
})

it('does not decrement total when item was not in local list', async () => {
  const axiosError = new Error('Not found') as Error & { isAxiosError: boolean; response: { status: number } }
  axiosError.isAxiosError = true
  axiosError.response = { status: 404 }
  vi.spyOn((await import('axios')).default, 'isAxiosError').mockImplementation((err) => {
    if (err === axiosError) return true
    return false
  })
|
task_id: NotBlankStr | None = Field(
    default=None, description="Task identifier (None for taskless agents)"
)
logger.debug(
    APPROVAL_GATE_CONTEXT_PARKED,
    approval_id=escalation.approval_id,
    agent_id=agent_id,
    note="No task_execution on context — task_id will be None",
)
web/src/stores/approvals.ts
Outdated
const item = await approvalsApi.getApproval(approvalId)
if (activeFilters.value) {
  // Filters are active — item may not match; just bump total
  total.value++
} else {
  approvals.value = [item, ...approvals.value]
  total.value++
}
Code Review
This is an excellent pull request that implements a critical approval workflow feature. The implementation is robust, secure, and well-tested across both backend and frontend. The code is well-designed with good separation of concerns, and includes important security hardening and resilience improvements. The refactoring of AgentEngine and the frontend approvals store are particularly noteworthy improvements. I have one suggestion to improve maintainability by reducing code duplication.
parts = action_type.split(":")
if len(parts) != 2 or not parts[0].strip() or not parts[1].strip():  # noqa: PLR2004
This validation logic for action_type is duplicated from src/ai_company/api/dto.py. While the implementation is slightly different (one raises ValueError for Pydantic, this one returns a ToolExecutionResult), the core logic is the same. This could lead to inconsistencies if the validation rules change in the future.
To improve maintainability, consider creating a shared utility function, for example in a new src/ai_company/core/utils.py module, that performs this validation and returns a boolean. Both the DTO and this tool could then use this shared function and handle the result as needed.
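One possible shape for such a shared helper; the module path and function name follow the reviewer's suggestion and are not existing code:

```python
# Hypothetical shared predicate for the 'category:action' format, usable by
# both the DTO validator (raise ValueError when False) and the tool
# (return a ToolExecutionResult error when False).
def is_valid_action_type(action_type: str) -> bool:
    """Return True if `action_type` has exactly two non-empty `:`-separated parts."""
    parts = action_type.split(":")
    return len(parts) == 2 and bool(parts[0].strip()) and bool(parts[1].strip())
```

Keeping the predicate pure (a bool, no exceptions) lets each call site choose its own failure mode, which is exactly where the two implementations currently diverge.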
- Rename _trigger_resume → _signal_resume_intent with TODO for scheduler
- Add warning log before UnauthorizedError in controller auth checks
- Add DEBUG log on successful parked record deletion
- New APPROVAL_GATE_PARK_TASKLESS event for pre-park debug log
- New APPROVAL_GATE_RISK_CLASSIFY_FAILED event for classification failure
- Fix SQLite schema: parked_contexts.task_id TEXT NOT NULL → TEXT (nullable)
- Extract shared is_valid_action_type() to core/validation.py
- Fix filtered WS handler: re-fetch query instead of blind total increment
- Add afterEach(vi.restoreAllMocks()) for spy cleanup in frontend tests
- Add 10 missing event constants to CLAUDE.md reference list
- Fix test-model-001 → test-small-001 per project conventions
- Add comprehensive tests: validation, approval_store, controller helpers, security_factory, approval_gate init/resume, risk classification failure, action_type format validation, TTL bounds, ApproveRequest comment length, loop_helpers taskless path and IOError handling
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/ai_company/persistence/sqlite/migrations.py (1)
148-163: ⚠️ Potential issue | 🔴 Critical: Schema change in existing migration won't apply to upgraded databases.

Modifying `_V3_STATEMENTS` to make `task_id` nullable has no effect on databases already at version 3+. The `CREATE TABLE IF NOT EXISTS` guard skips DDL when the table exists, so existing installs retain the `NOT NULL` constraint. This causes INSERT failures when `ParkService.park` is called with `task_id=None`. Add a V6 migration to alter the existing column. SQLite doesn't support `ALTER COLUMN`, so use table recreation:

Proposed V6 migration to fix existing databases
+_V6_STATEMENTS: Sequence[str] = (
+    # ── Make parked_contexts.task_id nullable ─────────────
+    """\
+CREATE TABLE IF NOT EXISTS parked_contexts_new (
+    id TEXT PRIMARY KEY,
+    execution_id TEXT NOT NULL,
+    agent_id TEXT NOT NULL,
+    task_id TEXT,
+    approval_id TEXT NOT NULL,
+    parked_at TEXT NOT NULL,
+    context_json TEXT NOT NULL,
+    metadata TEXT NOT NULL DEFAULT '{}'
+)""",
+    "INSERT INTO parked_contexts_new SELECT * FROM parked_contexts",
+    "DROP TABLE parked_contexts",
+    "ALTER TABLE parked_contexts_new RENAME TO parked_contexts",
+    "CREATE INDEX IF NOT EXISTS idx_pc_agent_id ON parked_contexts(agent_id)",
+    "CREATE INDEX IF NOT EXISTS idx_pc_approval_id ON parked_contexts(approval_id)",
+)
+
+
+async def _apply_v6(db: aiosqlite.Connection) -> None:
+    """Apply schema v6: make parked_contexts.task_id nullable."""
+    for stmt in _V6_STATEMENTS:
+        await db.execute(stmt)
Also update:

-SCHEMA_VERSION = 5
+SCHEMA_VERSION = 6

 _MIGRATIONS: list[tuple[int, _MigrateFn]] = [
     (1, _apply_v1),
     (2, _apply_v2),
     (3, _apply_v3),
     (4, _apply_v4),
     (5, _apply_v5),
+    (6, _apply_v6),
 ]

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/persistence/sqlite/migrations.py` around lines 148 - 163, The schema change in _V3_STATEMENTS making parked_contexts.task_id nullable will not affect existing DBs because CREATE TABLE IF NOT EXISTS is skipped; add a new migration version (V6) that recreates the parked_contexts table with task_id NULLABLE, copies data across, drops the old table and re-creates the indexes so upgraded DBs get the new schema; ensure the new migration is appended to the migrations sequence/registry used by the migrator and reference the parked_contexts table and ParkService.park behavior so INSERTs with task_id=None succeed.tests/unit/api/test_approval_store.py (1)
49-50: 🛠️ Refactor suggestion | 🟠 Major: Add the required 30s timeout marker for this test module.

These classes are marked `unit`, but the timeout requirement is not applied here. Add a module-level `pytestmark` (or class-level timeout marks) to enforce per-test time limits.

Suggested fix

 import pytest

 from ai_company.api.approval_store import ApprovalStore
 from ai_company.api.errors import ConflictError
 from ai_company.core.approval import ApprovalItem
 from ai_company.core.enums import ApprovalRiskLevel, ApprovalStatus

+pytestmark = [pytest.mark.unit, pytest.mark.timeout(30)]
+
@@
-@pytest.mark.unit
 class TestApprovalStore:
@@
-@pytest.mark.unit
 class TestSaveIfPending:
@@
-@pytest.mark.unit
 class TestApprovalStoreFilters:
@@
-@pytest.mark.unit
 class TestOnExpireCallback:
Also applies to: 179-180, 242-243, 304-305
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/api/test_approval_store.py` around lines 49 - 50, Add a module-level pytest mark that enforces a 30s per-test timeout so every test in this file runs with the required limit; declare a pytestmark combining the existing unit mark with a timeout (e.g., pytestmark = [pytest.mark.unit, pytest.mark.timeout(30)]) near the top of tests/unit/api/test_approval_store.py (before the TestApprovalStore class and the other test classes in the file) so all classes like TestApprovalStore inherit the 30s timeout.
♻️ Duplicate comments (1)
src/ai_company/api/controllers/approvals.py (1)
177-213: ⚠️ Potential issue | 🟠 Major: Decisions are still not wired to an actual resume path.

`_signal_resume_intent()` only logs `APPROVAL_GATE_RESUME_TRIGGERED` and returns. With approve/reject now depending on this helper, parked contexts can remain indefinitely parked unless another component performs the resume immediately.
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/api/controllers/approvals.py` around lines 177 - 213, _signal_resume_intent currently only logs APPROVAL_GATE_RESUME_TRIGGERED and never triggers a resume, leaving parked contexts indefinitely; update _signal_resume_intent to call into the approval gate (use app_state.approval_gate) to perform the actual resume—e.g., await approval_gate.resume_context(approval_id, approved=approved, decided_by=decided_by, decision_reason=decision_reason) or schedule it as a background task if non-blocking behavior is required—and add try/except to log success or failure (including error details) so the resume path is executed instead of relying solely on the log event.
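The non-blocking wiring this prompt suggests can be sketched as follows; `approval_gate` and the `resume_context` signature are assumptions taken from the review text, not the project's actual API:

```python
# Sketch: fire-and-forget resume after a decision, with the project's stated
# convention of re-raising critical errors and logging everything else.
import asyncio
import logging

logger = logging.getLogger(__name__)


async def _resume_in_background(approval_gate, approval_id: str) -> None:
    try:
        await approval_gate.resume_context(approval_id)
        logger.info("resume succeeded: %s", approval_id)
    except (MemoryError, RecursionError):
        raise  # never swallow critical errors
    except Exception:
        logger.exception("resume failed: %s", approval_id)


def schedule_resume(approval_gate, approval_id: str) -> "asyncio.Task[None]":
    # Scheduling via the running loop lets the HTTP handler return promptly;
    # the caller should retain the returned Task so it is not garbage-collected
    # before completion.
    return asyncio.get_running_loop().create_task(
        _resume_in_background(approval_gate, approval_id)
    )
```

A controller would call `schedule_resume(app_state.approval_gate, approval_id)` after persisting the decision, keeping the decision write and the (possibly slow) resume decoupled.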
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/engine/approval_gate.py`:
- Around line 213-247: The resume_context() flow currently returns (context,
parked.id) even when self._parked_context_repo.delete(parked.id) raises or
returns False, which lets retries replay the same parked execution; change the
logic in resume_context so that after attempting delete (the call to
self._parked_context_repo.delete in resume_context) you only return (context,
parked.id) if delete succeeded (True); if delete raised an exception (other than
critical exceptions) or returned False then raise a specific exception (e.g.,
RuntimeError or a custom DeleteParkedContextError) including approval_id and
parked.id so callers know cleanup failed and won’t resume the same parked
record; also fix the exception clause around MemoryError/RecursionError to use
the tuple form (except (MemoryError, RecursionError):) so those are re-raised
correctly.
- Around line 92-160: The park_context and resume_context functions are too
large and mix serialization, persistence, logging, and recovery; split them into
small helpers: extract the serialization and initial logging that calls
self._park_service.park into a helper (e.g., _serialize_and_log_parked_context)
that returns the ParkedContext and preserves the current try/except behavior
(re-raising MemoryError/RecursionError, logging
APPROVAL_GATE_CONTEXT_PARK_FAILED on other Exceptions), extract the persistence
block that awaits self._parked_context_repo.save into a separate helper (e.g.,
_persist_parked_context) that also re-raises MemoryError/RecursionError and logs
persistence failures, and do the analogous split for resume_context around
self._park_service.unpark/self._parked_context_repo.* operations; keep logger
calls, APPROVAL_GATE_* constants, and parameter passing (escalation, agent_id,
task_id, parked.id) unchanged so callers of park_context/resume_context become
thin orchestrators that call the new helpers.
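The guarded-cleanup shape suggested above can be sketched as follows — a minimal, self-contained illustration with a stub repository. `DeleteParkedContextError`, `FakeParkedRepo`, and `resume_cleanup` are hypothetical names, not the project's real API:

```python
import asyncio


class DeleteParkedContextError(RuntimeError):
    """Raised when a parked record cannot be cleaned up after resume."""


class FakeParkedRepo:
    """Stub standing in for ParkedContextRepository."""

    def __init__(self, records: set[str]) -> None:
        self.records = records

    async def delete(self, parked_id: str) -> bool:
        if parked_id in self.records:
            self.records.remove(parked_id)
            return True
        return False


async def resume_cleanup(repo: FakeParkedRepo, approval_id: str, parked_id: str) -> None:
    """Only treat the resume as successful when delete() actually succeeded."""
    try:
        deleted = await repo.delete(parked_id)
    except (MemoryError, RecursionError):
        raise  # never swallow interpreter-critical errors
    except Exception as exc:
        raise DeleteParkedContextError(
            f"delete raised for approval={approval_id} parked={parked_id}"
        ) from exc
    if not deleted:
        # Returning normally here would let a retry replay the parked work
        raise DeleteParkedContextError(
            f"delete returned False for approval={approval_id} parked={parked_id}"
        )


repo = FakeParkedRepo({"parked-1"})
asyncio.run(resume_cleanup(repo, "approval-1", "parked-1"))  # first resume: cleanup ok
```

A second call for the same `parked_id` now raises instead of silently resuming duplicate work, which is the replay protection the comment asks for.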
In `@src/ai_company/tools/approval_tool.py`:
- Around line 114-116: The execute() implementation currently accesses
arguments["action_type"], ["title"], and ["description"] directly which can
raise KeyError and bypass returning a ToolExecutionResult; update
ApprovalTool.execute to validate the arguments dict at the method boundary
(check presence and types of "action_type", "title", "description") and on any
missing/invalid value return a structured ToolExecutionResult error (or raise
the tool's expected exception) instead of indexing blindly; use
arguments.get(...) with explicit checks or an input validation helper, and
reference the existing execute(), action_type, title, and description symbols
when making the change.
In `@web/src/stores/approvals.ts`:
- Around line 77-85: The duplicate-prevention check using
approvals.value.some(...) runs before the async approvalsApi.getApproval call,
allowing a TOCTOU race if two approval.submitted events for the same approvalId
arrive; after awaiting getApproval(approvalId) re-check approvals.value for an
existing id and only prepend the fetched item and increment total.value if it
still does not exist (or bail out), referencing the approvals.value array,
approvalId variable, approvalsApi.getApproval method,
fetchApprovals(activeFilters.value) fallback, and total.value counter to locate
and update the logic.
---
Outside diff comments:
In `@src/ai_company/persistence/sqlite/migrations.py`:
- Around line 148-163: The schema change in _V3_STATEMENTS making
parked_contexts.task_id nullable will not affect existing DBs because CREATE
TABLE IF NOT EXISTS is skipped; add a new migration version (V6) that recreates
the parked_contexts table with task_id NULLABLE, copies data across, drops the
old table and re-creates the indexes so upgraded DBs get the new schema; ensure
the new migration is appended to the migrations sequence/registry used by the
migrator and reference the parked_contexts table and ParkService.park behavior
so INSERTs with task_id=None succeed.
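A copy-and-swap migration of the kind described above, runnable against an in-memory SQLite DB. The column set here is assumed for illustration — the real `parked_contexts` schema may differ:

```python
import sqlite3

# Hypothetical V6 statements: SQLite cannot drop NOT NULL in place,
# so the table is rebuilt, data copied, and the index re-created.
_V6_STATEMENTS = (
    """
    CREATE TABLE parked_contexts_new (
        id TEXT PRIMARY KEY,
        task_id TEXT,  -- now nullable for taskless agents
        agent_id TEXT NOT NULL,
        payload TEXT NOT NULL,
        created_at TEXT NOT NULL
    )
    """,
    "INSERT INTO parked_contexts_new"
    " SELECT id, task_id, agent_id, payload, created_at FROM parked_contexts",
    "DROP TABLE parked_contexts",
    "ALTER TABLE parked_contexts_new RENAME TO parked_contexts",
    "CREATE INDEX idx_parked_contexts_task_id ON parked_contexts(task_id)",
)

conn = sqlite3.connect(":memory:")
# Simulate an upgraded-from-old DB where task_id is still NOT NULL
conn.execute(
    "CREATE TABLE parked_contexts (id TEXT PRIMARY KEY, task_id TEXT NOT NULL,"
    " agent_id TEXT NOT NULL, payload TEXT NOT NULL, created_at TEXT NOT NULL)"
)
conn.execute("INSERT INTO parked_contexts VALUES ('p1', 't1', 'a1', '{}', '2026-01-01')")
for stmt in _V6_STATEMENTS:
    conn.execute(stmt)
# After migration, ParkService.park-style INSERTs with task_id=None succeed
conn.execute("INSERT INTO parked_contexts VALUES ('p2', NULL, 'a2', '{}', '2026-01-02')")
```

The rebuild keeps existing rows intact while relaxing the constraint, which `ALTER TABLE` alone cannot do in SQLite.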
In `@tests/unit/api/test_approval_store.py`:
- Around line 49-50: Add a module-level pytest mark that enforces a 30s per-test
timeout so every test in this file runs with the required limit; declare a
pytestmark combining the existing unit mark with a timeout (e.g., pytestmark =
[pytest.mark.unit, pytest.mark.timeout(30)]) near the top of
tests/unit/api/test_approval_store.py (before the TestApprovalStore class and
the other test classes in the file) so all classes like TestApprovalStore
inherit the 30s timeout.
---
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 764fa9b3-8db0-4b16-b082-79aa884f4a96
📒 Files selected for processing (19)
- CLAUDE.md
- src/ai_company/api/controllers/approvals.py
- src/ai_company/api/dto.py
- src/ai_company/core/validation.py
- src/ai_company/engine/approval_gate.py
- src/ai_company/engine/loop_helpers.py
- src/ai_company/observability/events/approval_gate.py
- src/ai_company/persistence/sqlite/migrations.py
- src/ai_company/tools/approval_tool.py
- tests/unit/api/controllers/test_approvals_helpers.py
- tests/unit/api/test_approval_store.py
- tests/unit/api/test_dto.py
- tests/unit/core/test_validation.py
- tests/unit/engine/test_approval_gate.py
- tests/unit/engine/test_loop_helpers_approval.py
- tests/unit/engine/test_security_factory.py
- tests/unit/tools/test_approval_tool.py
- web/src/__tests__/stores/approvals.test.ts
- web/src/stores/approvals.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build Backend
- GitHub Check: Build Web
- GitHub Check: Greptile Review
- GitHub Check: Test (Python 3.14)
🧰 Additional context used
📓 Path-based instructions (7)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Python version 3.14+ with PEP 649 native lazy annotations required.
Do NOT use `from __future__ import annotations` — Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.
Line length must be 88 characters, enforced by ruff.
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`. Tests must use `test-provider`, `test-small-001`, etc.
Files:
tests/unit/core/test_validation.py, tests/unit/api/controllers/test_approvals_helpers.py, tests/unit/engine/test_approval_gate.py, tests/unit/engine/test_security_factory.py, src/ai_company/api/controllers/approvals.py, src/ai_company/tools/approval_tool.py, src/ai_company/persistence/sqlite/migrations.py, tests/unit/tools/test_approval_tool.py, src/ai_company/engine/loop_helpers.py, tests/unit/engine/test_loop_helpers_approval.py, tests/unit/api/test_approval_store.py, src/ai_company/observability/events/approval_gate.py, tests/unit/api/test_dto.py, src/ai_company/engine/approval_gate.py, src/ai_company/core/validation.py, src/ai_company/api/dto.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Tests must use markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
Test coverage minimum: 80% (enforced in CI).
Async tests: use `asyncio_mode = "auto"` — no manual `@pytest.mark.asyncio` needed.
Test timeout: 30 seconds per test.
Prefer `@pytest.mark.parametrize` for testing similar cases.
Files:
tests/unit/core/test_validation.py, tests/unit/api/controllers/test_approvals_helpers.py, tests/unit/engine/test_approval_gate.py, tests/unit/engine/test_security_factory.py, tests/unit/tools/test_approval_tool.py, tests/unit/engine/test_loop_helpers_approval.py, tests/unit/api/test_approval_store.py, tests/unit/api/test_dto.py
web/src/**/*.{vue,ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Web dashboard: Vue 3 + PrimeVue + Tailwind CSS, organized by feature in src/components/, src/stores/, src/views/. Enforce with ESLint and vue-tsc type-checking.
Files:
web/src/stores/approvals.ts, web/src/__tests__/stores/approvals.test.ts
web/src/stores/**/*.ts
📄 CodeRabbit inference engine (CLAUDE.md)
Frontend state management: Pinia stores organized by feature in src/stores/
Files:
web/src/stores/approvals.ts
web/src/__tests__/**/*.ts
📄 CodeRabbit inference engine (CLAUDE.md)
Web dashboard tests: Vitest unit tests organized by feature in __tests__/
Files:
web/src/__tests__/stores/approvals.test.ts
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: All public functions must have type hints. Enforce via mypy strict mode.
Google-style docstrings required on public classes and functions, enforced by ruff D rules.
Create new objects instead of mutating existing ones. For non-Pydantic internal collections, use `copy.deepcopy()` at construction plus `MappingProxyType` wrapping for read-only enforcement.
For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 with `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. Use `@computed_field` for derived values instead of storing redundant fields (e.g., `TokenUsage.total_tokens`).
Use `NotBlankStr` from `core.types` for all identifier/name fields in Pydantic models, including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants, instead of manual whitespace validators.
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`.
Functions must be less than 50 lines; files must be less than 800 lines.
Handle errors explicitly, never silently swallow exceptions.
Validate at system boundaries: user input, external APIs, config files.
NEVER use `import logging`, `logging.getLogger()`, or `print()` in application code.
Always use `logger` as the variable name for loggers (not `_logger`, not `log`).
Event names must always use constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`
Files:
src/ai_company/api/controllers/approvals.py, src/ai_company/tools/approval_tool.py, src/ai_company/persistence/sqlite/migrations.py, src/ai_company/engine/loop_helpers.py, src/ai_company/observability/events/approval_gate.py, src/ai_company/engine/approval_gate.py, src/ai_company/core/validation.py, src/ai_company/api/dto.py
src/ai_company/**/[!_]*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Every module with business logic MUST have:
`from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
Files:
src/ai_company/api/controllers/approvals.py, src/ai_company/tools/approval_tool.py, src/ai_company/persistence/sqlite/migrations.py, src/ai_company/engine/loop_helpers.py, src/ai_company/observability/events/approval_gate.py, src/ai_company/engine/approval_gate.py, src/ai_company/core/validation.py, src/ai_company/api/dto.py
🧠 Learnings (14)
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to docs/design/**/*.md : When approved deviations occur, update the relevant `docs/design/` page to reflect the new reality.
Applied to files:
web/src/stores/approvals.ts
📚 Learning: 2026-03-13T21:03:58.907Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.907Z
Learning: Applies to web/src/stores/**/*.ts : Frontend state management: Pinia stores organized by feature in src/stores/
Applied to files:
web/src/stores/approvals.ts, web/src/__tests__/stores/approvals.test.ts
📚 Learning: 2026-03-13T21:03:58.907Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.907Z
Learning: Applies to web/src/__tests__/**/*.ts : Web dashboard tests: Vitest unit tests organized by feature in __tests__/
Applied to files:
web/src/__tests__/stores/approvals.test.ts
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : All error paths must log at WARNING or ERROR with context before raising.
Applied to files:
src/ai_company/api/controllers/approvals.py, CLAUDE.md
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/ai_company/**/[!_]*.py : Every module with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
Applied to files:
src/ai_company/api/controllers/approvals.py, CLAUDE.md
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Event names must always use constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
CLAUDE.md, src/ai_company/engine/loop_helpers.py, src/ai_company/observability/events/approval_gate.py
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Structured logging: always use `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : All state transitions must log at INFO level.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : NEVER use `import logging`, `logging.getLogger()`, or `print()` in application code.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Always use `logger` as the variable name for loggers (not `_logger`, not `log`).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : DEBUG logging for object creation, internal flow, and entry/exit of key functions.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Pure data models, enums, and re-exports do NOT need logging.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to **/*.py : NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`. Tests must use `test-provider`, `test-small-001`, etc.
Applied to files:
tests/unit/engine/test_loop_helpers_approval.py
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Use Pydantic v2 with `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. Use `computed_field` for derived values instead of storing redundant fields (e.g., `TokenUsage.total_tokens`).
Applied to files:
src/ai_company/api/dto.py
🧬 Code graph analysis (13)
tests/unit/core/test_validation.py (1)
- src/ai_company/core/validation.py (1): is_valid_action_type (6-19)
web/src/stores/approvals.ts (1)
- web/src/api/types.ts (1): WsEvent (519-524)
web/src/__tests__/stores/approvals.test.ts (2)
- web/src/stores/approvals.ts (1): useApprovalStore (8-141)
- web/src/api/types.ts (1): WsEvent (519-524)
tests/unit/engine/test_approval_gate.py (4)
- src/ai_company/engine/approval_gate.py (5): ApprovalGate (40-281), should_park (71-90), park_context (92-160), resume_context (162-246), build_resume_message (249-281)
- src/ai_company/engine/approval_gate_models.py (1): EscalationInfo (14-33)
- src/ai_company/persistence/repositories.py (1): ParkedContextRepository (199-267)
- src/ai_company/security/timeout/park_service.py (3): ParkService (29-144), park (37-106), resume (108-144)
tests/unit/engine/test_security_factory.py (4)
- src/ai_company/engine/_security_factory.py (2): make_security_interceptor (39-117), registry_with_approval_tool (120-148)
- src/ai_company/engine/errors.py (1): ExecutionStateError (17-18)
- src/ai_company/security/config.py (2): RuleEngineConfig (73-90), SecurityConfig (93-140)
- src/ai_company/tools/registry.py (1): all_tools (102-104)
src/ai_company/tools/approval_tool.py (5)
- src/ai_company/core/enums.py (1): ToolCategory (294-308)
- src/ai_company/core/validation.py (1): is_valid_action_type (6-19)
- src/ai_company/tools/base.py (6): BaseTool (57-184), ToolExecutionResult (25-54), description (138-140), category (128-130), action_type (133-135), parameters_schema (143-151)
- src/ai_company/api/approval_store.py (2): ApprovalStore (27-181), add (42-59)
- src/ai_company/security/timeout/risk_tier_classifier.py (1): DefaultRiskTierClassifier (62-101)
tests/unit/tools/test_approval_tool.py (3)
- src/ai_company/api/approval_store.py (3): ApprovalStore (27-181), get (61-73), add (42-59)
- src/ai_company/security/timeout/risk_tier_classifier.py (1): DefaultRiskTierClassifier (62-101)
- src/ai_company/tools/approval_tool.py (2): RequestHumanApprovalTool (34-253), execute (99-135)
src/ai_company/engine/loop_helpers.py (2)
- src/ai_company/engine/approval_gate.py (3): ApprovalGate (40-281), should_park (71-90), park_context (92-160)
- src/ai_company/engine/approval_gate_models.py (1): EscalationInfo (14-33)
tests/unit/api/test_approval_store.py (4)
- src/ai_company/api/approval_store.py (5): ApprovalStore (27-181), add (42-59), save_if_pending (125-142), list_items (75-104), get (61-73)
- src/ai_company/core/enums.py (2): ApprovalStatus (434-440), ApprovalRiskLevel (443-449)
- tests/unit/core/test_approval.py (1): _now (11-12)
- src/ai_company/core/approval.py (1): ApprovalItem (24-96)
tests/unit/api/test_dto.py (2)
- src/ai_company/api/dto.py (3): ApiResponse (39-57), ApproveRequest (247-256), CreateApprovalRequest (196-244)
- src/ai_company/core/enums.py (1): ApprovalRiskLevel (443-449)
src/ai_company/engine/approval_gate.py (7)
- src/ai_company/observability/_logger.py (1): get_logger (8-28)
- src/ai_company/persistence/repositories.py (1): ParkedContextRepository (199-267)
- src/ai_company/security/timeout/park_service.py (3): ParkService (29-144), park (37-106), resume (108-144)
- src/ai_company/security/timeout/parked_context.py (1): ParkedContext (19-66)
- src/ai_company/engine/approval_gate_models.py (1): EscalationInfo (14-33)
- src/ai_company/engine/context.py (1): AgentContext (87-307)
- src/ai_company/memory/errors.py (1): MemoryError (13-14)
src/ai_company/core/validation.py (1)
- src/ai_company/tools/base.py (1): action_type (133-135)
src/ai_company/api/dto.py (1)
- src/ai_company/core/validation.py (1): is_valid_action_type (6-19)
🪛 LanguageTool
CLAUDE.md
[style] ~154-~154: A comma is missing here.
Context: ...nder ai_company.observability.events (e.g. PROVIDER_CALL_START from `events.prov...
(EG_NO_COMMA)
🔇 Additional comments (18)
web/src/stores/approvals.ts (3)
3-3: LGTM! Appropriate import of axios for the `isAxiosError()` type guard used in the error handling branches.
56-72: LGTM! Clear documentation of the synchronous contract with async IIFE pattern. The payload validation defensively handles edge cases, and the `approval_id` type narrowing is correctly implemented.
73-127: Past review concerns properly addressed. The refactored handler now correctly reconciles filtered views by calling `fetchApprovals(activeFilters.value)` when filters are active (lines 78-80 and 99-101), rather than unconditionally mutating local state. The 404/410 handling and removal logic are also sound.
CLAUDE.md (1)
1-242: LGTM! Documentation updates accurately reflect the new approval gate components, event constants, and integration points. The expanded event names section comprehensively covers the approval gate lifecycle events.
src/ai_company/engine/loop_helpers.py (3)
11-14: LGTM! Event constants are correctly imported from the domain-specific module, and the new `APPROVAL_GATE_PARK_TASKLESS` constant semantically matches the debug log at lines 373-378, addressing the concern about using a misleading event constant.
241-342: LGTM! The approval gate integration is well-structured:
- Keyword-only `approval_gate` parameter prevents accidental positional usage
- Escalation check happens after tool results are appended to context
- Clear delegation to the `_park_for_approval` helper
346-419: LGTM! The `_park_for_approval` helper correctly:
- Extracts `task_id` safely with a `None` fallback for taskless agents
- Logs appropriately with `APPROVAL_GATE_PARK_TASKLESS` for the taskless case
- Returns `ERROR` termination with descriptive metadata when parking fails
- Includes a `parking_failed` flag in metadata for downstream handling
src/ai_company/core/validation.py (1)
1-19: LGTM! Clean, focused validation utility with proper docstring and type hints. The implementation correctly handles edge cases like whitespace-only segments.
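Inferred from the edge cases described here, the helper's behavior can be sketched as below — an assumption-based reconstruction for illustration, not the project's actual implementation:

```python
def is_valid_action_type(value: str) -> bool:
    """Return True when value matches the 'category:action' format.

    Rejects empty strings, whitespace-only segments, and inputs with
    anything other than exactly one colon.
    """
    parts = value.split(":")
    if len(parts) != 2:  # "deploy", "a:b:c", and "" all fail here
        return False
    # Each segment must contain at least one non-whitespace character
    return all(part.strip() for part in parts)
```

Under this sketch, `is_valid_action_type("deploy:production")` passes while `"deploy"`, `"a:b:c"`, and `" :x"` are all rejected.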
tests/unit/core/test_validation.py (1)
1-41: LGTM! Comprehensive parameterized test coverage for both valid and invalid `action_type` formats. Edge cases like whitespace-only segments, empty strings, and multiple colons are well covered.
tests/unit/api/test_dto.py (1)
1-175: LGTM! Comprehensive test coverage for the updated DTOs:
- `action_type` format validation with parameterized valid/invalid cases
- TTL bounds testing at boundary conditions
- `ApproveRequest.comment` optional/length constraints
src/ai_company/api/dto.py (3)
10-27: LGTM! Clean import organization with `field_validator` from Pydantic v2 and the shared `is_valid_action_type` utility from core validation.
196-244: LGTM! Well-structured `CreateApprovalRequest` with:
- `action_type` format validation via `@field_validator`
- TTL bounds (60s min, 7 days max) via `Field` constraints
- Metadata bounds validation via `@model_validator(mode="after")`
247-257: LGTM! `ApproveRequest` correctly uses `NotBlankStr | None` with `max_length=4096` for the optional comment field, consistent with other description-length fields in the codebase.
tests/unit/engine/test_approval_gate.py (1)
1-369: LGTM! Excellent test coverage for the `ApprovalGate` service:
- `should_park` behavior for empty/non-empty escalations
- `park_context` serialization and persistence paths, including error handling
- `resume_context` load/delete/error scenarios with resilience testing
- `build_resume_message` formatting, including prompt injection mitigation via `repr()` escaping
- Initialization warning behavior when no repository is provided
tests/unit/engine/test_loop_helpers_approval.py (2)
36-43: LGTM! The model identifier correctly uses `test-small-001`, following the repository's standardized test naming pattern.
67-273: LGTM! Comprehensive test coverage for approval gate integration in `execute_tool_calls`:
- No-gate path returns context normally
- Gate with no escalation continues execution
- Escalation triggers a `PARKED` result with metadata
- Parking failures return `ERROR` with the `parking_failed` flag
- Taskless agents park with `task_id=None`
- IO errors are handled gracefully
tests/unit/tools/test_approval_tool.py (2)
153-177: Parameterize the classifier matrix. These two tests only vary by `action_type` and expected `risk_level`. Folding them into one parametrized case will keep the classifier expectations aligned and easier to extend.
As per coding guidelines, prefer `@pytest.mark.parametrize` for testing similar cases.
♻️ Refactor sketch:

```diff
 class TestRiskClassification:
     """Risk classification with and without classifier."""

-    async def test_with_classifier_uses_known_action(
-        self,
-        tool_with_classifier: RequestHumanApprovalTool,
-    ) -> None:
-        result = await tool_with_classifier.execute(
-            arguments={
-                "action_type": "deploy:production",
-                "title": "Deploy v2.0",
-                "description": "Full deployment",
-            },
-        )
-        assert result.metadata["risk_level"] == "critical"
-
-    async def test_with_classifier_unknown_defaults_to_high(
+    @pytest.mark.parametrize(
+        ("action_type", "title", "description", "expected_risk"),
+        [
+            (
+                "deploy:production",
+                "Deploy v2.0",
+                "Full deployment",
+                ApprovalRiskLevel.CRITICAL.value,
+            ),
+            (
+                "custom:unknown",
+                "Custom action",
+                "Unknown action type",
+                ApprovalRiskLevel.HIGH.value,
+            ),
+        ],
+    )
+    async def test_with_classifier_risk_level(
         self,
         tool_with_classifier: RequestHumanApprovalTool,
+        action_type: str,
+        title: str,
+        description: str,
+        expected_risk: str,
     ) -> None:
         result = await tool_with_classifier.execute(
             arguments={
-                "action_type": "custom:unknown",
-                "title": "Custom action",
-                "description": "Unknown action type",
+                "action_type": action_type,
+                "title": title,
+                "description": description,
             },
         )
-        assert result.metadata["risk_level"] == "high"
+        assert result.metadata["risk_level"] == expected_risk
```

183-208: Nice validation matrix. The parametrized invalid `action_type` cases keep the negative-path coverage compact and easy to extend. As per coding guidelines, prefer `@pytest.mark.parametrize` for testing similar cases.
```python
deleted = False
try:
    deleted = await self._parked_context_repo.delete(parked.id)
except MemoryError, RecursionError:
    raise
except Exception:
    logger.exception(
        APPROVAL_GATE_RESUME_DELETE_FAILED,
        approval_id=approval_id,
        parked_id=parked.id,
        note="Context resumed but parked record not cleaned up",
    )

if not deleted:
    logger.warning(
        APPROVAL_GATE_RESUME_DELETE_FAILED,
        approval_id=approval_id,
        parked_id=parked.id,
        note="delete() returned False — parked record may still exist",
    )
else:
    logger.debug(
        APPROVAL_GATE_CONTEXT_RESUMED,
        approval_id=approval_id,
        parked_id=parked.id,
        note="Parked record successfully deleted",
    )

logger.info(
    APPROVAL_GATE_CONTEXT_RESUMED,
    approval_id=approval_id,
    parked_id=parked.id,
)
return context, parked.id
```
resume_context() can replay the same parked execution when cleanup fails.
When delete() fails or returns False, the method still returns (context, parked.id). A later retry can load the same parked record again and resume duplicate work.
Suggested fix
```diff
     deleted = False
     try:
         deleted = await self._parked_context_repo.delete(parked.id)
     except MemoryError, RecursionError:
         raise
     except Exception:
         logger.exception(
             APPROVAL_GATE_RESUME_DELETE_FAILED,
             approval_id=approval_id,
             parked_id=parked.id,
             note="Context resumed but parked record not cleaned up",
         )
+        raise
     if not deleted:
-        logger.warning(
+        logger.error(
             APPROVAL_GATE_RESUME_DELETE_FAILED,
             approval_id=approval_id,
             parked_id=parked.id,
             note="delete() returned False — parked record may still exist",
         )
+        msg = "Failed to delete parked context; refusing resume to avoid replay"
+        raise RuntimeError(msg)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/engine/approval_gate.py` around lines 213 - 247, The
resume_context() flow currently returns (context, parked.id) even when
self._parked_context_repo.delete(parked.id) raises or returns False, which lets
retries replay the same parked execution; change the logic in resume_context so
that after attempting delete (the call to self._parked_context_repo.delete in
resume_context) you only return (context, parked.id) if delete succeeded (True);
if delete raised an exception (other than critical exceptions) or returned False
then raise a specific exception (e.g., RuntimeError or a custom
DeleteParkedContextError) including approval_id and parked.id so callers know
cleanup failed and won’t resume the same parked record; also fix the exception
clause around MemoryError/RecursionError to use the tuple form (except
(MemoryError, RecursionError):) so those are re-raised correctly.
Codecov Report
❌ Patch coverage is
Additional details and impacted files

```text
@@            Coverage Diff             @@
##             main     #368      +/-   ##
==========================================
+ Coverage   93.90%   94.00%   +0.09%
==========================================
  Files         447      453       +6
  Lines       20819    21117     +298
  Branches     2011     2038      +27
==========================================
+ Hits        19551    19851     +300
+ Misses        981      975       -6
- Partials      287      291       +4
```

☔ View full report in Codecov by Sentry.
- Split park_context/resume_context into smaller helpers (<50 lines each)
- Fix duplicate APPROVAL_GATE_CONTEXT_RESUMED emission on success path
- Add V6 migration to make parked_contexts.task_id nullable on existing DBs
- Warn when external execution_loop bypasses approval gate wiring
- Guard execute() against KeyError on missing arguments
- Use APPROVAL_GATE_RISK_CLASSIFY_FAILED for "no classifier" debug log
- Fix TOCTOU race on WS duplicate prevention with post-fetch re-check
Pull request overview
Adds an approval-gating mechanism to the agent execution lifecycle so SecOps ESCALATE verdicts and agent-initiated request_human_approval tool calls can park execution, persist context, and later resume with a decision message injected into the conversation.
Changes:
- Introduces `ApprovalGate`, escalation/resume models, and `RequestHumanApprovalTool`, and wires escalation detection into `ToolInvoker` plus both ReAct and Plan/Execute loops.
- Hardens approvals API flows (auth-bound `requested_by`, optimistic concurrency via `save_if_pending`, resume-intent signalling, and broader WS publishing error handling) and updates DTO validation.
- Fixes frontend WebSocket approval events by consuming `approval_id` and fetching full items via API, with expanded unit tests and observability event constants.
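The invoker-side escalation tracking summarized above might look like the following minimal sketch. The method names (`record_verdict`, `drain_escalations`) and the use of call order for determinism are assumptions for illustration, not the project's real `ToolInvoker` API:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EscalationInfo:
    """Minimal stand-in for the real escalation model."""

    tool_name: str
    reason: str


@dataclass
class InvokerEscalations:
    """Collects ESCALATE verdicts emitted while tool calls execute."""

    pending_escalations: list[EscalationInfo] = field(default_factory=list)

    def record_verdict(self, tool_name: str, verdict: str, reason: str = "") -> None:
        # Appending in call order keeps the escalation sequence deterministic,
        # so parking decisions are reproducible across runs
        if verdict == "ESCALATE":
            self.pending_escalations.append(EscalationInfo(tool_name, reason))

    def drain_escalations(self) -> list[EscalationInfo]:
        """Hand pending escalations to the approval gate and reset state."""
        drained, self.pending_escalations = self.pending_escalations, []
        return drained
```

A loop would call `drain_escalations()` after each tool batch and park the context whenever the list is non-empty.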
Reviewed changes
Copilot reviewed 39 out of 39 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| web/src/stores/approvals.ts | Updates WS handling to use approval_id and fetch full items for list consistency. |
| web/src/__tests__/stores/approvals.test.ts | Adds async WS handler coverage and error-path tests. |
| tests/unit/tools/test_invoker_escalation.py | Adds unit tests for ToolInvoker escalation tracking behavior. |
| tests/unit/tools/test_approval_tool.py | Adds unit tests for RequestHumanApprovalTool validation, persistence, and risk classification. |
| tests/unit/persistence/test_migrations_v2.py | Updates schema-version assertions for the SQLite migration bump. |
| tests/unit/observability/test_events.py | Extends event discovery expectations for the new approval_gate domain. |
| tests/unit/engine/test_security_factory.py | Adds coverage for extracted security factory and approval-tool registry wiring. |
| tests/unit/engine/test_loop_helpers_approval.py | Tests loop helper parking behavior when escalations are present. |
| tests/unit/engine/test_approval_gate_models.py | Tests validation/immutability of escalation and resume payload models. |
| tests/unit/engine/test_approval_gate.py | Tests ApprovalGate park/resume flows and resume-message construction. |
| tests/unit/core/test_validation.py | Adds tests for shared category:action action_type validation helper. |
| tests/unit/api/test_dto.py | Updates DTO tests for action_type validation and request schema changes. |
| tests/unit/api/test_approval_store.py | Adds tests for save_if_pending, combined filters, and expiration callback behavior. |
| tests/unit/api/controllers/test_approvals_helpers.py | Tests approvals controller helpers (auth checks, publishing, resume intent). |
| tests/unit/api/controllers/test_approvals.py | Updates controller tests for action_type format and removal of requested_by from payload. |
| src/ai_company/tools/registry.py | Adds all_tools() to support composing registries with the approval tool. |
| src/ai_company/tools/invoker.py | Tracks ESCALATE and parking-metadata escalations via pending_escalations. |
| src/ai_company/tools/approval_tool.py | Implements request_human_approval tool and persists ApprovalItems with parking metadata. |
| src/ai_company/tools/__init__.py | Exports the new approval tool. |
| src/ai_company/security/timeout/parked_context.py | Makes parked context task_id nullable for taskless agents. |
| src/ai_company/security/timeout/park_service.py | Allows task_id=None during context parking. |
| src/ai_company/persistence/sqlite/migrations.py | Bumps schema to v6 and alters parked_contexts schema for nullable task_id. |
| src/ai_company/observability/events/approval_gate.py | Adds structured event constants for approval gate lifecycle. |
| src/ai_company/engine/react_loop.py | Wires optional ApprovalGate into ReAct tool execution path. |
| src/ai_company/engine/plan_execute_loop.py | Wires optional ApprovalGate into Plan/Execute step tool execution path. |
| src/ai_company/engine/loop_helpers.py | Parks execution and returns PARKED termination when escalations require approval. |
| src/ai_company/engine/approval_gate_models.py | Adds Pydantic models for escalation info and resume payloads. |
| src/ai_company/engine/approval_gate.py | Implements context park/persist/resume coordination and safe resume-message formatting. |
| src/ai_company/engine/agent_engine.py | Constructs ApprovalGate, wires it into default loop, and injects approval tool into registry. |
| src/ai_company/engine/_security_factory.py | Extracts security interceptor + approval-tool registry composition from AgentEngine. |
| src/ai_company/engine/__init__.py | Exposes ApprovalGate and related models at package level. |
| src/ai_company/core/validation.py | Adds shared is_valid_action_type() helper. |
| src/ai_company/api/state.py | Stores/exports an optional ApprovalGate on app state. |
| src/ai_company/api/dto.py | Removes requested_by from create DTO, adds action_type validator, and bounds approve comment length. |
| src/ai_company/api/controllers/approvals.py | Binds requested_by to auth user, adds concurrency guard, and emits resume-intent signals. |
| src/ai_company/api/approval_store.py | Adds save_if_pending() and hardens expiration callback handling. |
| docs/design/engine.md | Documents approval-gate parking behavior in the run loop. |
| README.md | Updates status to reflect approval gates implemented. |
| CLAUDE.md | Updates repository/module overview and observability event guidance for approval gate. |
src/ai_company/api/approval_store.py (outdated)

```python
"""Update only if the stored item is still PENDING.

Returns the saved item on success, or ``None`` if the stored
item is no longer PENDING (concurrent decision detected).
```

src/ai_company/persistence/sqlite/migrations.py (outdated)

```python
    "INSERT OR IGNORE INTO parked_contexts_new SELECT * FROM parked_contexts",
    "DROP TABLE IF EXISTS parked_contexts",
    "ALTER TABLE parked_contexts_new RENAME TO parked_contexts",
```

src/ai_company/api/controllers/approvals.py (outdated)

```diff
 saved = await app_state.approval_store.save_if_pending(updated)
 if saved is None:
+    msg = "Approval has already been decided by another request"
     logger.warning(
-        API_RESOURCE_NOT_FOUND,
-        resource="approval",
-        id=approval_id,
-        note="disappeared between get and save",
+        API_APPROVAL_CONFLICT,
+        approval_id=approval_id,
+        note=msg,
     )
-    msg = f"Approval {approval_id!r} not found"
-    raise NotFoundError(msg)
+    raise ConflictError(msg)
```
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/ai_company/engine/agent_engine.py (1)
631-667: 🛠️ Refactor suggestion | 🟠 Major

Extract the approval wiring out of AgentEngine.

This file is now 871 lines, above the repo's 800-line cap, and the new gate/tool/security factory logic widens an already broad orchestrator. Moving this wiring into a dedicated builder/module would keep execution flow readable and the class within policy.
As per coding guidelines, "Functions must be less than 50 lines; files must be less than 800 lines."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/engine/agent_engine.py` around lines 631 - 667, Extract the approval/permission/security wiring out of AgentEngine by creating a dedicated builder (e.g., ToolInvokerFactory or ToolBuilder) that encapsulates the logic currently in _make_security_interceptor and _make_tool_invoker: move the calls to make_security_interceptor, registry_with_approval_tool, ToolPermissionChecker.from_permissions and construction of ToolInvoker (including use of self._approval_store, self._tool_registry, and identity/task_id/effective_autonomy) into that new class, give it a clear API (e.g., build_tool_invoker(identity, task_id=None, effective_autonomy=None) and optionally build_security_interceptor(effective_autonomy=None)), update AgentEngine._make_tool_invoker and _make_security_interceptor to delegate to the new builder, and ensure no behavioral changes (preserve parameters and return types for ToolInvoker and SecurityInterceptionStrategy).
♻️ Duplicate comments (2)
src/ai_company/tools/approval_tool.py (1)
113-124: ⚠️ Potential issue | 🟠 Major

Reject malformed tool args instead of coercing them.

`str()` turns `None`, numbers, and objects into persisted approval text, so invalid tool payloads can create real approval items instead of returning a validation error. Validate `action_type`, `title`, and `description` as strings at the boundary and log the rejection before returning. As per coding guidelines, "Validate at system boundaries: user input, external APIs, config files."

Possible fix

```diff
-        try:
-            action_type = str(arguments["action_type"])
-            title = str(arguments["title"])
-            description = str(arguments["description"])
-        except KeyError as exc:
+        action_type = arguments.get("action_type")
+        title = arguments.get("title")
+        description = arguments.get("description")
+        if not (
+            isinstance(action_type, str)
+            and isinstance(title, str)
+            and isinstance(description, str)
+        ):
+            logger.warning(
+                APPROVAL_GATE_ESCALATION_FAILED,
+                agent_id=self._agent_id,
+                note="Invalid request_human_approval arguments",
+            )
             return ToolExecutionResult(
                 content=(
-                    f"Missing required argument: {exc}. "
-                    f"Required: action_type, title, description"
+                    "Invalid arguments: action_type, title, and description "
+                    "must be strings."
                 ),
                 is_error=True,
             )
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/tools/approval_tool.py` around lines 113 - 124, The current code coerces non-string values via str() for arguments["action_type"], ["title"], and ["description"], allowing None/number/object payloads to create approvals; change this to explicitly validate that arguments contains these keys and that each value is an instance of str (and non-empty if desired) before proceeding, log a clear rejection message (including which arg failed and its type/value) using the module/class logger, and return the existing ToolExecutionResult error path when validation fails; reference the arguments dict, the action_type/title/description symbols, and ToolExecutionResult to locate where to add the checks and logging.

src/ai_company/engine/approval_gate.py (1)
260-287: ⚠️ Potential issue | 🟠 Major

Treat parked-record cleanup failures as fatal.

`resume_context()` still returns success after `delete()` raises or returns `False`, which leaves the parked row available for a later retry and can replay the same execution twice. `_cleanup_parked()` should raise (or return a hard failure) so the caller never resumes work until cleanup succeeds.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/engine/approval_gate.py` around lines 260 - 287, The _cleanup_parked method currently logs failures from self._parked_context_repo.delete(parked.id) but does not propagate them, allowing resume_context to proceed; change _cleanup_parked (and its except handling) so any exception from _parked_context_repo.delete is re-raised after logging (fix the except syntax to "except (MemoryError, RecursionError):" for proper re-raise), and if delete returns False treat that as fatal by logging and then raising a clear exception (e.g., RuntimeError or a custom CleanupError) with approval_id and parked.id so callers (resume_context) never continue when cleanup failed; keep the APPROVAL_GATE_RESUME_DELETE_FAILED log messages for context.
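The fatal-cleanup behavior the prompt asks for can be sketched in a few lines. A sketch under stated assumptions: the function and exception names are invented, the repo is any object with a `delete()` method, and logging is elided (the project would emit its APPROVAL_GATE_RESUME_DELETE_FAILED event at both failure points).

```python
class ParkedCleanupError(RuntimeError):
    """Raised when a parked record cannot be removed before resume."""


def cleanup_parked(repo, approval_id: str, parked_id: str) -> None:
    """Delete the parked row, treating any failure as fatal.

    Exceptions from repo.delete() propagate to the caller, and a
    False return raises, so resume_context() never reports success
    after a cleanup failure (which could replay the execution).
    """
    deleted = repo.delete(parked_id)  # any exception propagates as-is
    if not deleted:
        raise ParkedCleanupError(
            f"parked record {parked_id!r} for approval "
            f"{approval_id!r} was not deleted"
        )
```

The design choice is that the caller decides what a failed resume means (retry, alert, abort); the cleanup helper only guarantees it never lies about success.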
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/engine/agent_engine.py`:
- Around line 609-625: The current _make_approval_gate() enables ApprovalGate
whenever _approval_store is present even if _parked_context_repo is None, which
would create non-resumable "parked" approvals; update _make_approval_gate to
require both _approval_store and _parked_context_repo before constructing
ApprovalGate: if _approval_store is None return None as before, but if
_approval_store is present and _parked_context_repo is None raise a clear
RuntimeError (or otherwise keep approval gating disabled) so we fail fast
instead of creating an ApprovalGate with parked_context_repo=None; reference the
_make_approval_gate function, ApprovalGate class, _approval_store and
_parked_context_repo when making the change.
In `@src/ai_company/persistence/sqlite/migrations.py`:
- Around line 224-241: The v6 migration (_V6_STATEMENTS) assumes parked_contexts
exists and always runs the INSERT/DROP/RENAME sequence, which breaks recovery
when parked_contexts was dropped but parked_contexts_new was not renamed; update
the migration logic in the _apply_v6 path to explicitly handle three states: (1)
normal pre-migration where parked_contexts exists — create parked_contexts_new,
copy rows, drop old table, rename new; (2) interrupted state where
parked_contexts is missing but parked_contexts_new already exists — skip the
INSERT and DROP and just ensure parked_contexts_new is renamed to
parked_contexts (or verify indices); and (3) already-migrated state where
parked_contexts exists and parked_contexts_new does not — no-op; implement these
checks by querying sqlite_master or using "SELECT name FROM sqlite_master WHERE
type='table' AND name IN ('parked_contexts','parked_contexts_new')" and
conditionally executing the appropriate statements instead of blindly running
the current _V6_STATEMENTS sequence.
In `@tests/unit/persistence/test_migrations_v2.py`:
- Around line 31-32: Update the test_schema_version_is_five (and the similar
test at lines 53-61) to actually verify the v5→v6 migration behavior: create a
pre-migration parked_contexts table with task_id NOT NULL, call the migration
runner (run_migrations or the test helper that invokes _apply_v6), then query
PRAGMA table_info('parked_contexts') and assert that the returned row for column
"task_id" has notnull == 0; keep the existing SCHEMA_VERSION assertion but add
this setup, migration invocation, and PRAGMA-based assertion to ensure _apply_v6
truly makes parked_contexts.task_id nullable.
In `@web/src/stores/approvals.ts`:
- Around line 78-87: The code reads activeFilters.value before awaiting
approvalsApi.getApproval(approvalId), so if filters change while the network
call is in flight you may mutate the wrong list; after the await re-read
activeFilters.value and if it no longer matches the value used to choose the
branch call fetchApprovals(activeFilters.value) (or fetchApprovals() when falsy)
instead of mutating approvals.value/total.value; update the same logic in the
other occurrence (lines ~102-119) and reference fetchApprovals,
approvalsApi.getApproval, activeFilters.value, approvals.value, and total.value
when making the change.
- Around line 78-80: WS-triggered fetchApprovals() calls can race with
user-driven requests and allow stale, slower responses to overwrite newer state;
update fetchApprovals to include request sequencing or cancellation so only the
latest response updates approvals, total, and loading (e.g., maintain a
per-store requestId/counter or an AbortController token checked on response and
incremented on each new invocation) and ensure WS handlers that call
fetchApprovals honor that mechanism so older responses are ignored and loading
reflects the real in-flight status.
---
Outside diff comments:
In `@src/ai_company/engine/agent_engine.py`:
- Around line 631-667: Extract the approval/permission/security wiring out of
AgentEngine by creating a dedicated builder (e.g., ToolInvokerFactory or
ToolBuilder) that encapsulates the logic currently in _make_security_interceptor
and _make_tool_invoker: move the calls to make_security_interceptor,
registry_with_approval_tool, ToolPermissionChecker.from_permissions and
construction of ToolInvoker (including use of self._approval_store,
self._tool_registry, and identity/task_id/effective_autonomy) into that new
class, give it a clear API (e.g., build_tool_invoker(identity, task_id=None,
effective_autonomy=None) and optionally
build_security_interceptor(effective_autonomy=None)), update
AgentEngine._make_tool_invoker and _make_security_interceptor to delegate to the
new builder, and ensure no behavioral changes (preserve parameters and return
types for ToolInvoker and SecurityInterceptionStrategy).
---
Duplicate comments:
In `@src/ai_company/engine/approval_gate.py`:
- Around line 260-287: The _cleanup_parked method currently logs failures from
self._parked_context_repo.delete(parked.id) but does not propagate them,
allowing resume_context to proceed; change _cleanup_parked (and its except
handling) so any exception from _parked_context_repo.delete is re-raised after
logging (fix the except syntax to "except (MemoryError, RecursionError):" for
proper re-raise), and if delete returns False treat that as fatal by logging and
then raising a clear exception (e.g., RuntimeError or a custom CleanupError)
with approval_id and parked.id so callers (resume_context) never continue when
cleanup failed; keep the APPROVAL_GATE_RESUME_DELETE_FAILED log messages for
context.
In `@src/ai_company/tools/approval_tool.py`:
- Around line 113-124: The current code coerces non-string values via str() for
arguments["action_type"], ["title"], and ["description"], allowing
None/number/object payloads to create approvals; change this to explicitly
validate that arguments contains these keys and that each value is an instance
of str (and non-empty if desired) before proceeding, log a clear rejection
message (including which arg failed and its type/value) using the module/class
logger, and return the existing ToolExecutionResult error path when validation
fails; reference the arguments dict, the action_type/title/description symbols,
and ToolExecutionResult to locate where to add the checks and logging.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4ee0ae83-5f68-4c43-9eb1-378901c8917a
📒 Files selected for processing (6)
- src/ai_company/engine/agent_engine.py
- src/ai_company/engine/approval_gate.py
- src/ai_company/persistence/sqlite/migrations.py
- src/ai_company/tools/approval_tool.py
- tests/unit/persistence/test_migrations_v2.py
- web/src/stores/approvals.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Agent
- GitHub Check: Build Backend
- GitHub Check: Build Web
- GitHub Check: Greptile Review
- GitHub Check: Test (Python 3.14)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Python version 3.14+ with PEP 649 native lazy annotations required.
Do NOT use `from __future__ import annotations` — Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.
Line length must be 88 characters, enforced by ruff.
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`. Tests must use `test-provider`, `test-small-001`, etc.
Files:
- src/ai_company/persistence/sqlite/migrations.py
- tests/unit/persistence/test_migrations_v2.py
- src/ai_company/engine/agent_engine.py
- src/ai_company/tools/approval_tool.py
- src/ai_company/engine/approval_gate.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: All public functions must have type hints. Enforce via mypy strict mode.
Google-style docstrings required on public classes and functions, enforced by ruff D rules.
Create new objects instead of mutating existing ones. For non-Pydantic internal collections, use `copy.deepcopy()` at construction plus `MappingProxyType` wrapping for read-only enforcement.
For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 with `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. Use `@computed_field` for derived values instead of storing redundant fields (e.g., `TokenUsage.total_tokens`).
Use `NotBlankStr` from `core.types` for all identifier/name fields in Pydantic models, including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants, instead of manual whitespace validators.
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`.
Functions must be less than 50 lines; files must be less than 800 lines.
Handle errors explicitly, never silently swallow exceptions.
Validate at system boundaries: user input, external APIs, config files.
NEVER use `import logging`, `logging.getLogger()`, or `print()` in application code.
Always use `logger` as the variable name for loggers (not `_logger`, not `log`).
Event names must always use constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`). Import dire...
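The deepcopy-at-construction plus `MappingProxyType` guideline above can be illustrated with a minimal stdlib sketch; the `AgentConfig` class and its fields are invented for illustration, not taken from the repo.

```python
import copy
from types import MappingProxyType


class AgentConfig:
    """Read-only view over a caller-supplied mapping.

    Sketch of the guideline: deep-copy at construction so later
    caller mutations don't leak in, then wrap in MappingProxyType
    so consumers can't write through the view.
    """

    def __init__(self, settings: dict[str, object]) -> None:
        self._settings = MappingProxyType(copy.deepcopy(settings))

    @property
    def settings(self) -> MappingProxyType:
        return self._settings


raw = {"tools": ["search"], "limits": {"max_steps": 5}}
cfg = AgentConfig(raw)
raw["tools"].append("shell")  # caller mutation does not reach cfg
```

Attempting `cfg.settings["tools"] = []` raises `TypeError`, since `mappingproxy` rejects item assignment; the deepcopy ensures the nested list was never shared in the first place.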
Files:
- src/ai_company/persistence/sqlite/migrations.py
- src/ai_company/engine/agent_engine.py
- src/ai_company/tools/approval_tool.py
- src/ai_company/engine/approval_gate.py
src/ai_company/**/[!_]*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Every module with business logic MUST have:
`from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
Files:
src/ai_company/persistence/sqlite/migrations.pysrc/ai_company/engine/agent_engine.pysrc/ai_company/tools/approval_tool.pysrc/ai_company/engine/approval_gate.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Tests must use markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
Test coverage minimum: 80% (enforced in CI).
Async tests: use `asyncio_mode = "auto"` — no manual `@pytest.mark.asyncio` needed.
Test timeout: 30 seconds per test.
Prefer `@pytest.mark.parametrize` for testing similar cases.
Files:
tests/unit/persistence/test_migrations_v2.py
web/src/**/*.{vue,ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Web dashboard: Vue 3 + PrimeVue + Tailwind CSS, organized by feature in src/components/, src/stores/, src/views/. Enforce with ESLint and vue-tsc type-checking.
Files:
web/src/stores/approvals.ts
web/src/stores/**/*.ts
📄 CodeRabbit inference engine (CLAUDE.md)
Frontend state management: Pinia stores organized by feature in src/stores/
Files:
web/src/stores/approvals.ts
🧠 Learnings (5)
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Event names must always use constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/ai_company/engine/agent_engine.py
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to docs/design/**/*.md : When approved deviations occur, update the relevant `docs/design/` page to reflect the new reality.
Applied to files:
web/src/stores/approvals.ts
📚 Learning: 2026-03-13T21:03:58.907Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.907Z
Learning: When review agents find valid issues (including pre-existing issues in surrounding code, suggestions, and findings adjacent to the PR's changes), fix them all. No deferring, no "out of scope" skipping.
Applied to files:
web/src/stores/approvals.ts
📚 Learning: 2026-03-13T21:03:58.907Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.907Z
Learning: Applies to web/src/stores/**/*.ts : Frontend state management: Pinia stores organized by feature in src/stores/
Applied to files:
web/src/stores/approvals.ts
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Functions must be less than 50 lines; files must be less than 800 lines.
Applied to files:
src/ai_company/engine/approval_gate.py
🧬 Code graph analysis (4)
tests/unit/persistence/test_migrations_v2.py (2)
- src/ai_company/persistence/sqlite/migrations.py (1)
  - get_user_version (247-251)
- tests/unit/persistence/sqlite/conftest.py (1)
  - memory_db (15-22)

src/ai_company/engine/agent_engine.py (3)
- src/ai_company/engine/_security_factory.py (1)
  - make_security_interceptor (39-117)
- src/ai_company/engine/approval_gate.py (1)
  - ApprovalGate (40-322)
- src/ai_company/security/timeout/park_service.py (1)
  - ParkService (29-144)

web/src/stores/approvals.ts (1)
- web/src/api/types.ts (1)
  - WsEvent (519-524)

src/ai_company/tools/approval_tool.py (6)
- src/ai_company/core/enums.py (1)
  - ToolCategory (294-308)
- src/ai_company/core/validation.py (1)
  - is_valid_action_type (6-19)
- src/ai_company/tools/base.py (5)
  - BaseTool (57-184)
  - ToolExecutionResult (25-54)
  - description (138-140)
  - category (128-130)
  - action_type (133-135)
- src/ai_company/api/approval_store.py (2)
  - ApprovalStore (27-181)
  - add (42-59)
- src/ai_company/security/timeout/risk_tier_classifier.py (1)
  - DefaultRiskTierClassifier (62-101)
- src/ai_company/memory/errors.py (1)
  - MemoryError (13-14)
🔇 Additional comments (2)
web/src/stores/approvals.ts (1)
56-71: Good runtime guard for the new `approval_id` payload.

Switching to `payload.approval_id` and returning early on malformed payloads keeps the store aligned with the backend WS contract and avoids mutating state from bad events.

src/ai_company/engine/approval_gate.py (1)

289-322: Good separation of control data from untrusted approval text.

Keeping the decision in a fixed system wrapper and labeling the human reason as untrusted data is the right shape for resume injection hardening.
```diff
+def _make_approval_gate(self) -> ApprovalGate | None:
+    """Build an ApprovalGate if an approval store is configured.
+
-    Raises:
-        ExecutionStateError: If effective_autonomy is provided but
-            no SecurityConfig is configured — autonomy cannot be
-            enforced without the security subsystem.
+    Returns ``None`` when no approval store is available — the
+    execution loop skips approval-gate checks in that case.
     """
-    if self._security_config is None:
-        if effective_autonomy is not None:
-            msg = (
-                "effective_autonomy cannot be enforced without "
-                "SecurityConfig — configure security or remove autonomy"
-            )
-            logger.error(SECURITY_DISABLED, note=msg)
-            raise ExecutionStateError(msg)
-        logger.warning(
-            SECURITY_DISABLED,
-            note="No SecurityConfig provided — all security checks skipped",
-        )
-        return None
-    if not self._security_config.enabled:
-        if effective_autonomy is not None:
-            msg = "effective_autonomy cannot be enforced when security is disabled"
-            logger.error(SECURITY_DISABLED, note=msg)
-            raise ExecutionStateError(msg)
+    if self._approval_store is None:
+        return None
+
-    cfg = self._security_config
-    re_cfg = cfg.rule_engine
-    policy_validator = PolicyValidator(
-        hard_deny_action_types=frozenset(cfg.hard_deny_action_types),
-        auto_approve_action_types=frozenset(cfg.auto_approve_action_types),
-    )
-    # Build the detector list respecting config flags.
-    detectors: list[SecurityRule] = [policy_validator]
-    if re_cfg.credential_patterns_enabled:
-        detectors.append(CredentialDetector())
-    if re_cfg.path_traversal_detection_enabled:
-        detectors.append(PathTraversalDetector())
-    if re_cfg.destructive_op_detection_enabled:
-        detectors.append(DestructiveOpDetector())
-    if re_cfg.data_leak_detection_enabled:
-        detectors.append(DataLeakDetector())
-
-    rule_engine = RuleEngine(
-        rules=tuple(detectors),
-        risk_classifier=RiskClassifier(),
-        config=re_cfg,
-    )
-    return SecOpsService(
-        config=cfg,
-        rule_engine=rule_engine,
-        audit_log=self._audit_log,
-        output_scanner=OutputScanner(),
+    from ai_company.security.timeout.park_service import (  # noqa: PLC0415
+        ParkService,
+    )
+
+    return ApprovalGate(
+        park_service=ParkService(),
+        parked_context_repo=self._parked_context_repo,
+    )
```
Don't enable approval parking without persisted parked contexts.
`_make_approval_gate()` creates a live gate whenever `approval_store` exists, but `ApprovalGate` explicitly treats `parked_context_repo=None` as non-resumable. In this configuration the engine can return PARKED and queue an approval that can never restore the execution. Fail fast here, or keep approval gating disabled until both components are configured.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/engine/agent_engine.py` around lines 609 - 625, The current
_make_approval_gate() enables ApprovalGate whenever _approval_store is present
even if _parked_context_repo is None, which would create non-resumable "parked"
approvals; update _make_approval_gate to require both _approval_store and
_parked_context_repo before constructing ApprovalGate: if _approval_store is
None return None as before, but if _approval_store is present and
_parked_context_repo is None raise a clear RuntimeError (or otherwise keep
approval gating disabled) so we fail fast instead of creating an ApprovalGate
with parked_context_repo=None; reference the _make_approval_gate function,
ApprovalGate class, _approval_store and _parked_context_repo when making the
change.
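The fail-fast check the prompt describes is small enough to sketch standalone. Assumptions: the free function below stands in for the `_make_approval_gate` method, and a plain tuple stands in for the real `ApprovalGate`.

```python
def make_approval_gate(approval_store, parked_context_repo):
    """Sketch of the suggested fail-fast wiring.

    Returns None when approval gating is disabled, and refuses to
    build a gate that could park work without any way to resume it.
    """
    if approval_store is None:
        # No store: approval gating is simply off.
        return None
    if parked_context_repo is None:
        # Store without a parked-context repo would produce PARKED
        # executions that can never be restored; fail at startup.
        raise RuntimeError(
            "approval_store is configured but parked_context_repo is not; "
            "parked executions would never be resumable"
        )
    return ("ApprovalGate", approval_store, parked_context_repo)
```

Raising at construction time moves the failure from "an agent silently parks forever" to a clear startup error, which is the cheaper place to discover a misconfiguration.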
```python
_V6_STATEMENTS: Sequence[str] = (
    # ── Make parked_contexts.task_id nullable ─────────────
    """\
CREATE TABLE IF NOT EXISTS parked_contexts_new (
    id TEXT PRIMARY KEY,
    execution_id TEXT NOT NULL,
    agent_id TEXT NOT NULL,
    task_id TEXT,
    approval_id TEXT NOT NULL,
    parked_at TEXT NOT NULL,
    context_json TEXT NOT NULL,
    metadata TEXT NOT NULL DEFAULT '{}'
)""",
    "INSERT OR IGNORE INTO parked_contexts_new SELECT * FROM parked_contexts",
    "DROP TABLE IF EXISTS parked_contexts",
    "ALTER TABLE parked_contexts_new RENAME TO parked_contexts",
    "CREATE INDEX IF NOT EXISTS idx_pc_agent_id ON parked_contexts(agent_id)",
    "CREATE INDEX IF NOT EXISTS idx_pc_approval_id ON parked_contexts(approval_id)",
```
Make the v6 rebuild rerunnable from a partially applied v5→v6 upgrade.
If startup dies after Line 238 drops `parked_contexts` but before Line 239 renames `parked_contexts_new`, `user_version` is still 5 and the next run executes only `_apply_v6()`. Line 237 then fails with `no such table: parked_contexts`, so recovery requires manual DB repair. Handle the `parked_contexts_new` exists / `parked_contexts` missing state explicitly instead of always copying from the old table.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/persistence/sqlite/migrations.py` around lines 224 - 241, The
v6 migration (_V6_STATEMENTS) assumes parked_contexts exists and always runs the
INSERT/DROP/RENAME sequence, which breaks recovery when parked_contexts was
dropped but parked_contexts_new was not renamed; update the migration logic in
the _apply_v6 path to explicitly handle three states: (1) normal pre-migration
where parked_contexts exists — create parked_contexts_new, copy rows, drop old
table, rename new; (2) interrupted state where parked_contexts is missing but
parked_contexts_new already exists — skip the INSERT and DROP and just ensure
parked_contexts_new is renamed to parked_contexts (or verify indices); and (3)
already-migrated state where parked_contexts exists and parked_contexts_new does
not — no-op; implement these checks by querying sqlite_master or using "SELECT
name FROM sqlite_master WHERE type='table' AND name IN
('parked_contexts','parked_contexts_new')" and conditionally executing the
appropriate statements instead of blindly running the current _V6_STATEMENTS
sequence.
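A rerunnable rebuild along the lines the prompt sketches can be written against `sqlite_master` directly. A minimal sketch, not the repo's migration runner: the table and column names follow the PR's `parked_contexts` schema, but `apply_v6` and its state handling are illustrative.

```python
import sqlite3


def apply_v6(conn: sqlite3.Connection) -> None:
    """Rerunnable sketch of the v5→v6 parked_contexts rebuild.

    Safe to call from any of the states a crash can leave behind:
    old table present, only parked_contexts_new present (interrupted
    between DROP and RENAME), or already migrated.
    """

    def exists(name: str) -> bool:
        row = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type='table' AND name=?",
            (name,),
        ).fetchone()
        return row is not None

    if exists("parked_contexts"):
        # Pre-migration (or already-migrated) state: rebuild with a
        # nullable task_id. Repeating this on a migrated table just
        # round-trips the rows, so it stays idempotent.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS parked_contexts_new ("
            "id TEXT PRIMARY KEY, execution_id TEXT NOT NULL, "
            "agent_id TEXT NOT NULL, task_id TEXT, "
            "approval_id TEXT NOT NULL, parked_at TEXT NOT NULL, "
            "context_json TEXT NOT NULL, "
            "metadata TEXT NOT NULL DEFAULT '{}')"
        )
        conn.execute(
            "INSERT OR IGNORE INTO parked_contexts_new "
            "SELECT * FROM parked_contexts"
        )
        conn.execute("DROP TABLE parked_contexts")
    if exists("parked_contexts_new"):
        # Finishes both the rebuild above and an interrupted earlier
        # run that died between DROP and RENAME.
        conn.execute(
            "ALTER TABLE parked_contexts_new RENAME TO parked_contexts"
        )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_pc_agent_id "
        "ON parked_contexts(agent_id)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_pc_approval_id "
        "ON parked_contexts(approval_id)"
    )
```

Because every branch is guarded by a live `sqlite_master` lookup, re-running the migration after any crash point converges on the same final schema instead of failing on a missing table.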
```ts
if (activeFilters.value) {
  // Filters active — re-fetch the filtered query to stay consistent
  await fetchApprovals(activeFilters.value)
} else {
  try {
    const item = await approvalsApi.getApproval(approvalId)
    // Re-check after async fetch to prevent duplicate insertion
    if (!approvals.value.some((a) => a.id === approvalId)) {
      approvals.value = [item, ...approvals.value]
      total.value++
```
Re-check the current filters after getApproval() resolves.
The branch is chosen from activeFilters.value before the network await. If the user changes queries while getApproval() is in flight, the old unfiltered path can still prepend, update, or remove against the new filtered list and leave the UI in the wrong state. Re-read activeFilters.value immediately before any local mutation and refresh the current query when it changed.
🛠️ Suggested fix
try {
const item = await approvalsApi.getApproval(approvalId)
+ if (activeFilters.value) {
+ await fetchApprovals(activeFilters.value)
+ return
+ }
// Re-check after async fetch to prevent duplicate insertion
if (!approvals.value.some((a) => a.id === approvalId)) {
approvals.value = [item, ...approvals.value]
total.value++
}
@@
try {
const updated = await approvalsApi.getApproval(approvalId)
+ if (activeFilters.value) {
+ await fetchApprovals(activeFilters.value)
+ return
+ }
approvals.value = approvals.value.map((a) =>
a.id === approvalId ? updated : a,
)

Also applies to: 102-119
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web/src/stores/approvals.ts` around lines 78 - 87, The code reads
activeFilters.value before awaiting approvalsApi.getApproval(approvalId), so if
filters change while the network call is in flight you may mutate the wrong
list; after the await re-read activeFilters.value and if it no longer matches
the value used to choose the branch call fetchApprovals(activeFilters.value) (or
fetchApprovals() when falsy) instead of mutating approvals.value/total.value;
update the same logic in the other occurrence (lines ~102-119) and reference
fetchApprovals, approvalsApi.getApproval, activeFilters.value, approvals.value,
and total.value when making the change.
if (activeFilters.value) {
  // Filters active — re-fetch the filtered query to stay consistent
  await fetchApprovals(activeFilters.value)
Background refreshes need stale-response protection.
These WS-triggered fetchApprovals() calls can overlap with a later user-driven fetch. Right now the slower, older response can land last, overwrite a newer query, and clear loading while the newer request is still in flight. Add request sequencing or cancellation before applying approvals, total, and loading.
🛠️ Suggested fix in `fetchApprovals()`
+ let fetchSeq = 0
+
async function fetchApprovals(filters?: ApprovalFilters) {
+ const seq = ++fetchSeq
loading.value = true
error.value = null
activeFilters.value = filters ? { ...filters } : undefined
try {
const result = await approvalsApi.listApprovals(filters)
+ if (seq !== fetchSeq) return
approvals.value = result.data
total.value = result.total
} catch (err) {
+ if (seq !== fetchSeq) return
error.value = getErrorMessage(err)
} finally {
- loading.value = false
+ if (seq === fetchSeq) loading.value = false
}
}

Also applies to: 102-104
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web/src/stores/approvals.ts` around lines 78 - 80, WS-triggered
fetchApprovals() calls can race with user-driven requests and allow stale,
slower responses to overwrite newer state; update fetchApprovals to include
request sequencing or cancellation so only the latest response updates
approvals, total, and loading (e.g., maintain a per-store requestId/counter or
an AbortController token checked on response and incremented on each new
invocation) and ensure WS handlers that call fetchApprovals honor that mechanism
so older responses are ignored and loading reflects the real in-flight status.
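The sequencing counter itself is a small, language-agnostic pattern. A Python sketch follows; all names are illustrative, since the real store is TypeScript:

```python
import asyncio


class ApprovalsStore:
    """Minimal sketch of the request-sequencing guard, transposed to Python."""

    def __init__(self) -> None:
        self._seq = 0
        self.approvals: list[str] = []
        self.loading = False

    async def fetch(self, delay: float, result: list[str]) -> None:
        self._seq += 1
        seq = self._seq              # capture this request's ticket
        self.loading = True
        await asyncio.sleep(delay)   # stands in for the network round-trip
        if seq != self._seq:
            return                   # a newer request started: drop stale data
        self.approvals = result
        self.loading = False


async def demo() -> list[str]:
    store = ApprovalsStore()
    async with asyncio.TaskGroup() as tg:
        tg.create_task(store.fetch(0.05, ["stale"]))  # slow, started first
        await asyncio.sleep(0.01)
        tg.create_task(store.fetch(0.0, ["fresh"]))   # fast, started second
    return store.approvals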
- Fix double APPROVAL_GATE_RESUME_DELETE_FAILED emission: early return after exception log prevents fallthrough to if-not-deleted branch
- Add APPROVAL_GATE_LOOP_WIRING_WARNING constant for external loop misconfiguration (was misusing APPROVAL_GATE_INITIALIZED)
- Wire APPROVAL_GATE_RISK_CLASSIFIED to classifier success path; remove misleading usage for "no classifier" info log (was RISK_CLASSIFY_FAILED)
- Add APPROVAL_GATE_LOOP_WIRING_WARNING to CLAUDE.md event reference
Actionable comments posted: 1
♻️ Duplicate comments (3)
src/ai_company/engine/agent_engine.py (1)
609-625: ⚠️ Potential issue | 🟠 Major

Do not enable approval parking without parked-context persistence.
This wiring allows PARKED termination with parked_context_repo=None, creating non-resumable parked executions.

🔧 Suggested fix

 def _make_approval_gate(self) -> ApprovalGate | None:
     """Build an ApprovalGate if an approval store is configured.
@@
     if self._approval_store is None:
         return None
+    if self._parked_context_repo is None:
+        msg = (
+            "approval_store configured without parked_context_repo; "
+            "PARKED executions would be non-resumable"
+        )
+        logger.error(
+            APPROVAL_GATE_LOOP_WIRING_WARNING,
+            note=msg,
+        )
+        raise RuntimeError(msg)
     from ai_company.security.timeout.park_service import (  # noqa: PLC0415
         ParkService,
     )

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/engine/agent_engine.py` around lines 609 - 625, The current _make_approval_gate creates an ApprovalGate when _approval_store is set even if self._parked_context_repo is None, allowing non-resumable PARKED executions; change _make_approval_gate (and its return logic) to only construct and return an ApprovalGate when both self._approval_store and self._parked_context_repo are present (otherwise return None or log/raise as appropriate), referencing ApprovalGate, ParkService, self._approval_store and self._parked_context_repo so parked-context persistence is enforced before enabling approval parking.

src/ai_company/engine/approval_gate.py (1)
260-287: ⚠️ Potential issue | 🔴 Critical

Fail resume when parked-context cleanup does not succeed.
This path logs delete failures but still lets resume succeed, which can replay the same parked execution later.

🔧 Suggested fix

 async def _cleanup_parked(
     self,
     parked: ParkedContext,
     approval_id: str,
 ) -> None:
@@
     try:
         deleted = await self._parked_context_repo.delete(parked.id)
     except MemoryError, RecursionError:
         raise
-    except Exception:
+    except Exception as exc:
         logger.exception(
             APPROVAL_GATE_RESUME_DELETE_FAILED,
             approval_id=approval_id,
             parked_id=parked.id,
             note="Context resumed but parked record not cleaned up",
         )
-        return
+        msg = (
+            f"Failed to delete parked context for approval_id={approval_id!r}, "
+            f"parked_id={parked.id!r}"
+        )
+        raise RuntimeError(msg) from exc
     if not deleted:
-        logger.warning(
+        logger.error(
             APPROVAL_GATE_RESUME_DELETE_FAILED,
             approval_id=approval_id,
             parked_id=parked.id,
             note="delete() returned False — may cause duplicate resume",
         )
+        msg = (
+            f"Delete returned False for approval_id={approval_id!r}, "
+            f"parked_id={parked.id!r}"
+        )
+        raise RuntimeError(msg)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/engine/approval_gate.py` around lines 260 - 287, The _cleanup_parked method currently logs failures from self._parked_context_repo.delete(parked.id) but allows resume to succeed, which can cause duplicate replays; change it so that any delete failure or a False return value causes the resume to fail by propagating an exception (or raising a specific ResumeCleanupError) instead of only logging—update the except Exception block (and the branch where not deleted) to raise after logging (include APPROVAL_GATE_RESUME_DELETE_FAILED, approval_id and parked_id in the log) so callers of _cleanup_parked/_resume see the failure and abort replay.

src/ai_company/tools/approval_tool.py (1)
114-117: ⚠️ Potential issue | 🟠 Major

Avoid coercing non-string tool inputs with str(...).
This path still accepts invalid argument types (e.g., dict/int) by coercing them to strings, which weakens the tool contract at the input boundary.

🔧 Suggested fix

 try:
-    action_type = str(arguments["action_type"])
-    title = str(arguments["title"])
-    description = str(arguments["description"])
+    action_type_raw = arguments["action_type"]
+    title_raw = arguments["title"]
+    description_raw = arguments["description"]
 except KeyError as exc:
     return ToolExecutionResult(
         content=(
             f"Missing required argument: {exc}. "
             f"Required: action_type, title, description"
         ),
         is_error=True,
     )
+if (
+    not isinstance(action_type_raw, str)
+    or not isinstance(title_raw, str)
+    or not isinstance(description_raw, str)
+):
+    logger.warning(
+        APPROVAL_GATE_ESCALATION_FAILED,
+        agent_id=self._agent_id,
+        note="Invalid request_human_approval argument types",
+    )
+    return ToolExecutionResult(
+        content=(
+            "Invalid arguments: action_type, title, and description "
+            "must be strings."
+        ),
+        is_error=True,
+    )
+action_type = action_type_raw
+title = title_raw
+description = description_raw

As per coding guidelines, "Validate at system boundaries: user input, external APIs, config files."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/tools/approval_tool.py` around lines 114 - 117, The code currently coerces inputs with str(arguments["action_type"/"title"/"description"]) which masks invalid types; instead validate the arguments dict entries explicitly (e.g., check presence and use isinstance(arguments["action_type"], str), isinstance(arguments["title"], str), isinstance(arguments["description"], str)), and if any check fails raise a clear ValueError or return a structured error explaining the invalid parameter; update the logic that assigns action_type, title, and description to perform these checks before assignment and do not use str(...) coercion.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CLAUDE.md`:
- Line 154: The long example list of event constants is missing a separating
comma near the end; update the sentence so every listed constant is followed by
a comma (specifically add a comma after `APPROVAL_GATE_LOOP_WIRING_WARNING` from
`events.approval_gate` and ensure other entries like
`APPROVAL_GATE_NO_PARKED_CONTEXT`, `APPROVAL_GATE_LOOP_WIRING_WARNING` are
comma-separated) so the examples remain readable and properly punctuated in the
line that imports EVENT_CONSTANTs from ai_company.observability.events domains.
---
Duplicate comments:
In `@src/ai_company/engine/agent_engine.py`:
- Around line 609-625: The current _make_approval_gate creates an ApprovalGate
when _approval_store is set even if self._parked_context_repo is None, allowing
non-resumable PARKED executions; change _make_approval_gate (and its return
logic) to only construct and return an ApprovalGate when both
self._approval_store and self._parked_context_repo are present (otherwise return
None or log/raise as appropriate), referencing ApprovalGate, ParkService,
self._approval_store and self._parked_context_repo so parked-context persistence
is enforced before enabling approval parking.
In `@src/ai_company/engine/approval_gate.py`:
- Around line 260-287: The _cleanup_parked method currently logs failures from
self._parked_context_repo.delete(parked.id) but allows resume to succeed, which
can cause duplicate replays; change it so that any delete failure or a False
return value causes the resume to fail by propagating an exception (or raising a
specific ResumeCleanupError) instead of only logging—update the except Exception
block (and the branch where not deleted) to raise after logging (include
APPROVAL_GATE_RESUME_DELETE_FAILED, approval_id and parked_id in the log) so
callers of _cleanup_parked/_resume see the failure and abort replay.
In `@src/ai_company/tools/approval_tool.py`:
- Around line 114-117: The code currently coerces inputs with
str(arguments["action_type"/"title"/"description"]) which masks invalid types;
instead validate the arguments dict entries explicitly (e.g., check presence and
use isinstance(arguments["action_type"], str), isinstance(arguments["title"],
str), isinstance(arguments["description"], str)), and if any check fails raise a
clear ValueError or return a structured error explaining the invalid parameter;
update the logic that assigns action_type, title, and description to perform
these checks before assignment and do not use str(...) coercion.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8d77745e-ccbc-4515-ac50-fffa0b25059a
📒 Files selected for processing (5)
- CLAUDE.md
- src/ai_company/engine/agent_engine.py
- src/ai_company/engine/approval_gate.py
- src/ai_company/observability/events/approval_gate.py
- src/ai_company/tools/approval_tool.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build Web
- GitHub Check: Build Backend
- GitHub Check: Greptile Review
- GitHub Check: Test (Python 3.14)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Python version 3.14+ with PEP 649 native lazy annotations required.
Do NOT use `from __future__ import annotations` — Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.
Line length must be 88 characters, enforced by ruff.
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`. Tests must use `test-provider`, `test-small-001`, etc.
Files:
- src/ai_company/engine/agent_engine.py
- src/ai_company/engine/approval_gate.py
- src/ai_company/observability/events/approval_gate.py
- src/ai_company/tools/approval_tool.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: All public functions must have type hints. Enforce via mypy strict mode.
Google-style docstrings required on public classes and functions, enforced by ruff D rules.
Create new objects instead of mutating existing ones. For non-Pydantic internal collections, use `copy.deepcopy()` at construction plus `MappingProxyType` wrapping for read-only enforcement.
For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 with `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. Use `@computed_field` for derived values instead of storing redundant fields (e.g., `TokenUsage.total_tokens`).
Use `NotBlankStr` from `core.types` for all identifier/name fields in Pydantic models, including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants, instead of manual whitespace validators.
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`.
Functions must be less than 50 lines; files must be less than 800 lines.
Handle errors explicitly, never silently swallow exceptions.
Validate at system boundaries: user input, external APIs, config files.
NEVER useimport logging,logging.getLogger(), orprint()in application code.
Always useloggeras the variable name for loggers (not_logger, notlog).
Event names must always use constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`). Import dire...
Files:
- src/ai_company/engine/agent_engine.py
- src/ai_company/engine/approval_gate.py
- src/ai_company/observability/events/approval_gate.py
- src/ai_company/tools/approval_tool.py
src/ai_company/**/[!_]*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Every module with business logic MUST have:
`from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
Files:
- src/ai_company/engine/agent_engine.py
- src/ai_company/engine/approval_gate.py
- src/ai_company/observability/events/approval_gate.py
- src/ai_company/tools/approval_tool.py
🧠 Learnings (10)
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/ai_company/**/[!_]*.py : Every module with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Event names must always use constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
- CLAUDE.md
- src/ai_company/engine/agent_engine.py
- src/ai_company/observability/events/approval_gate.py
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Structured logging: always use `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : All state transitions must log at INFO level.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : All error paths must log at WARNING or ERROR with context before raising.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : NEVER use `import logging`, `logging.getLogger()`, or `print()` in application code.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Always use `logger` as the variable name for loggers (not `_logger`, not `log`).
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : DEBUG logging for object creation, internal flow, and entry/exit of key functions.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Pure data models, enums, and re-exports do NOT need logging.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Functions must be less than 50 lines; files must be less than 800 lines.
Applied to files:
src/ai_company/engine/approval_gate.py
🧬 Code graph analysis (2)
src/ai_company/engine/agent_engine.py (7)
src/ai_company/engine/_security_factory.py (2)
- make_security_interceptor (39-117)
- registry_with_approval_tool (120-148)
src/ai_company/engine/_validation.py (1)
- validate_agent (61-73)
src/ai_company/engine/approval_gate.py (1)
- ApprovalGate (40-322)
src/ai_company/persistence/repositories.py (1)
- ParkedContextRepository (199-267)
src/ai_company/engine/loop_protocol.py (1)
- ExecutionLoop (151-189)
src/ai_company/engine/react_loop.py (1)
- ReactLoop (55-269)
src/ai_company/tools/permissions.py (2)
- ToolPermissionChecker (35-225)
- from_permissions (115-128)
src/ai_company/tools/approval_tool.py (5)
src/ai_company/core/enums.py (1)
- ToolCategory (294-308)
src/ai_company/core/validation.py (1)
- is_valid_action_type (6-19)
src/ai_company/tools/base.py (2)
- category (128-130)
- action_type (133-135)
src/ai_company/api/approval_store.py (2)
- ApprovalStore (27-181)
- add (42-59)
src/ai_company/memory/errors.py (1)
- MemoryError (13-14)
🪛 LanguageTool
CLAUDE.md
[style] ~154-~154: A comma is missing here.
Context: ...nder ai_company.observability.events (e.g. PROVIDER_CALL_START from `events.prov...
(EG_NO_COMMA)
🔇 Additional comments (1)
src/ai_company/observability/events/approval_gate.py (1)
5-19: Approval-gate event constants are clean and well-scoped. Nice coverage of the lifecycle and consistent domain prefixing.
- **Never** use `import logging` / `logging.getLogger()` / `print()` in application code
- **Variable name**: always `logger` (not `_logger`, not `log`)
| - **Event names**: always use constants from the domain-specific module under `ai_company.observability.events` (e.g. `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`, `CFO_ANOMALY_DETECTED` from `events.cfo`, `CONFLICT_DETECTED` from `events.conflict`, `MEETING_STARTED` from `events.meeting`, `CLASSIFICATION_START` from `events.classification`, `CONSOLIDATION_START` from `events.consolidation`, `ORG_MEMORY_QUERY_START` from `events.org_memory`, `API_REQUEST_STARTED` from `events.api`, `API_ROUTE_NOT_FOUND` from `events.api`, `CODE_RUNNER_EXECUTE_START` from `events.code_runner`, `DOCKER_EXECUTE_START` from `events.docker`, `MCP_INVOKE_START` from `events.mcp`, `SECURITY_EVALUATE_START` from `events.security`, `HR_HIRING_REQUEST_CREATED` from `events.hr`, `PERF_METRIC_RECORDED` from `events.performance`, `TRUST_EVALUATE_START` from `events.trust`, `PROMOTION_EVALUATE_START` from `events.promotion`, `PROMPT_BUILD_START` from `events.prompt`, `MEMORY_RETRIEVAL_START` from `events.memory`, `MEMORY_BACKEND_CONNECTED` from `events.memory`, `MEMORY_ENTRY_STORED` from `events.memory`, `MEMORY_BACKEND_SYSTEM_ERROR` from `events.memory`, `AUTONOMY_ACTION_AUTO_APPROVED` from `events.autonomy`, `TIMEOUT_POLICY_EVALUATED` from `events.timeout`, `PERSISTENCE_AUDIT_ENTRY_SAVED` from `events.persistence`, `TASK_ENGINE_STARTED` from `events.task_engine`, `COORDINATION_STARTED` from `events.coordination`, `COMMUNICATION_DISPATCH_START` from `events.communication`, `COMPANY_STARTED` from `events.company`, `CONFIG_LOADED` from `events.config`, `CORRELATION_ID_CREATED` from `events.correlation`, `DECOMPOSITION_STARTED` from `events.decomposition`, `DELEGATION_STARTED` from `events.delegation`, `EXECUTION_LOOP_STARTED` from `events.execution`, `GIT_OPERATION_START` from `events.git`, `PARALLEL_EXECUTION_STARTED` from `events.parallel`, `PERSONALITY_LOADED` from `events.personality`, `QUOTA_CHECKED` from `events.quota`, `ROLE_ASSIGNED` from 
`events.role`, `ROUTING_STARTED` from `events.routing`, `SANDBOX_EXECUTE_START` from `events.sandbox`, `TASK_CREATED` from `events.task`, `TASK_ASSIGNMENT_STARTED` from `events.task_assignment`, `TASK_ROUTING_STARTED` from `events.task_routing`, `TEMPLATE_LOADED` from `events.template`, `TOOL_INVOKE_START` from `events.tool`, `WORKSPACE_CREATED` from `events.workspace`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT` | ||
| - **Event names**: always use constants from the domain-specific module under `ai_company.observability.events` (e.g. `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`, `CFO_ANOMALY_DETECTED` from `events.cfo`, `CONFLICT_DETECTED` from `events.conflict`, `MEETING_STARTED` from `events.meeting`, `CLASSIFICATION_START` from `events.classification`, `CONSOLIDATION_START` from `events.consolidation`, `ORG_MEMORY_QUERY_START` from `events.org_memory`, `API_REQUEST_STARTED` from `events.api`, `API_ROUTE_NOT_FOUND` from `events.api`, `CODE_RUNNER_EXECUTE_START` from `events.code_runner`, `DOCKER_EXECUTE_START` from `events.docker`, `MCP_INVOKE_START` from `events.mcp`, `SECURITY_EVALUATE_START` from `events.security`, `HR_HIRING_REQUEST_CREATED` from `events.hr`, `PERF_METRIC_RECORDED` from `events.performance`, `TRUST_EVALUATE_START` from `events.trust`, `PROMOTION_EVALUATE_START` from `events.promotion`, `PROMPT_BUILD_START` from `events.prompt`, `MEMORY_RETRIEVAL_START` from `events.memory`, `MEMORY_BACKEND_CONNECTED` from `events.memory`, `MEMORY_ENTRY_STORED` from `events.memory`, `MEMORY_BACKEND_SYSTEM_ERROR` from `events.memory`, `AUTONOMY_ACTION_AUTO_APPROVED` from `events.autonomy`, `TIMEOUT_POLICY_EVALUATED` from `events.timeout`, `PERSISTENCE_AUDIT_ENTRY_SAVED` from `events.persistence`, `TASK_ENGINE_STARTED` from `events.task_engine`, `COORDINATION_STARTED` from `events.coordination`, `COMMUNICATION_DISPATCH_START` from `events.communication`, `COMPANY_STARTED` from `events.company`, `CONFIG_LOADED` from `events.config`, `CORRELATION_ID_CREATED` from `events.correlation`, `DECOMPOSITION_STARTED` from `events.decomposition`, `DELEGATION_STARTED` from `events.delegation`, `EXECUTION_LOOP_STARTED` from `events.execution`, `GIT_OPERATION_START` from `events.git`, `PARALLEL_EXECUTION_STARTED` from `events.parallel`, `PERSONALITY_LOADED` from `events.personality`, `QUOTA_CHECKED` from `events.quota`, `ROLE_ASSIGNED` from 
`events.role`, `ROUTING_STARTED` from `events.routing`, `SANDBOX_EXECUTE_START` from `events.sandbox`, `TASK_CREATED` from `events.task`, `TASK_ASSIGNMENT_STARTED` from `events.task_assignment`, `TASK_ROUTING_STARTED` from `events.task_routing`, `TEMPLATE_LOADED` from `events.template`, `TOOL_INVOKE_START` from `events.tool`, `WORKSPACE_CREATED` from `events.workspace`, `APPROVAL_GATE_ESCALATION_DETECTED` from `events.approval_gate`, `APPROVAL_GATE_ESCALATION_FAILED` from `events.approval_gate`, `APPROVAL_GATE_INITIALIZED` from `events.approval_gate`, `APPROVAL_GATE_RISK_CLASSIFIED` from `events.approval_gate`, `APPROVAL_GATE_RISK_CLASSIFY_FAILED` from `events.approval_gate`, `APPROVAL_GATE_CONTEXT_PARKED` from `events.approval_gate`, `APPROVAL_GATE_CONTEXT_PARK_FAILED` from `events.approval_gate`, `APPROVAL_GATE_PARK_TASKLESS` from `events.approval_gate`, `APPROVAL_GATE_RESUME_STARTED` from `events.approval_gate`, `APPROVAL_GATE_CONTEXT_RESUMED` from `events.approval_gate`, `APPROVAL_GATE_RESUME_FAILED` from `events.approval_gate`, `APPROVAL_GATE_RESUME_DELETE_FAILED` from `events.approval_gate`, `APPROVAL_GATE_RESUME_TRIGGERED` from `events.approval_gate`, `APPROVAL_GATE_NO_PARKED_CONTEXT` from `events.approval_gate`, `APPROVAL_GATE_LOOP_WIRING_WARNING` from `events.approval_gate`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT` |
Fix punctuation in the long event-list sentence.
There’s a missing comma in this line, and the sentence becomes hard to parse. Please add the comma to keep the examples readable.
🧰 Tools
🪛 LanguageTool
[style] ~154-~154: A comma is missing here.
Context: ...nder ai_company.observability.events (e.g. PROVIDER_CALL_START from `events.prov...
(EG_NO_COMMA)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@CLAUDE.md` at line 154, The long example list of event constants is missing a
separating comma near the end; update the sentence so every listed constant is
followed by a comma (specifically add a comma after
`APPROVAL_GATE_LOOP_WIRING_WARNING` from `events.approval_gate` and ensure other
entries like `APPROVAL_GATE_NO_PARKED_CONTEXT`,
`APPROVAL_GATE_LOOP_WIRING_WARNING` are comma-separated) so the examples remain
readable and properly punctuated in the line that imports EVENT_CONSTANTs from
ai_company.observability.events domains.
@@ -237,7 +253,7 @@ class ApproveRequest(BaseModel):
RejectRequest.reason missing max_length constraint
ApproveRequest.comment received a max_length=4096 bound in this PR, but RejectRequest.reason (which serves the same role — a user-supplied decision explanation injected into the resume message via build_resume_message) was not updated.
Since _signal_resume_intent passes decision_reason=data.reason to the gate, and the future scheduler will call ApprovalGate.build_resume_message(decision_reason=...) to build the prompt-injected message, an unbounded reason string can produce an arbitrarily large system message. Consistent with the approve path:
reason: NotBlankStr | None = Field(default=None, max_length=4096)
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/api/dto.py
Line: 253
Comment:
**`RejectRequest.reason` missing `max_length` constraint**
`ApproveRequest.comment` received a `max_length=4096` bound in this PR, but `RejectRequest.reason` (which serves the same role — a user-supplied decision explanation injected into the resume message via `build_resume_message`) was not updated.
Since `_signal_resume_intent` passes `decision_reason=data.reason` to the gate, and the future scheduler will call `ApprovalGate.build_resume_message(decision_reason=...)` to build the prompt-injected message, an unbounded `reason` string can produce an arbitrarily large system message. Consistent with the approve path:
```suggestion
reason: NotBlankStr | None = Field(default=None, max_length=4096)
```
How can I resolve this? If you propose a fix, please make it concise.

- Fix save_if_pending() docstring to cover all None cases (not-found, expired, already-decided)
- Fix approve/reject ConflictError message: "no longer pending (already decided or expired)" instead of "decided by another request"
- Make V6 migration crash-safe: rename old→backup before replacing, explicit column list in INSERT to avoid schema-order coupling
- Add V6 migration tests: verify task_id nullable via PRAGMA, verify data preservation across migration
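The PRAGMA-based nullability assertion the new tests rely on can be sketched as follows; the table and column names here are illustrative, not the project's real DDL:

```python
import sqlite3


def column_is_nullable(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Check a column's nullability via PRAGMA table_info.

    Each row is (cid, name, type, notnull, dflt_value, pk); notnull == 0
    means the column accepts NULL, which is what the V6 test asserts for
    task_id after migration.
    """
    for _cid, name, _type, notnull, _default, _pk in conn.execute(
        f"PRAGMA table_info({table})"
    ):
        if name == column:
            return notnull == 0
    raise ValueError(f"column {column!r} not found in {table!r}")
```

A test can then assert nullability directly instead of round-tripping a NULL insert.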
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/persistence/test_migrations_v2.py (1)
20-26: 🧹 Nitpick | 🔵 Trivial

Consider using shared fixture from conftest.
The `memory_db` fixture is duplicated from `tests/unit/persistence/sqlite/conftest.py`. If this test file is moved or the fixture is needed elsewhere, consider either:
- Moving this test to `tests/unit/persistence/sqlite/` to use the shared fixture
- Promoting the fixture to a higher-level `conftest.py`

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/persistence/test_migrations_v2.py` around lines 20 - 26, The memory_db fixture is duplicated; instead of defining async def memory_db in tests/unit/persistence/test_migrations_v2.py, remove this local fixture and use the shared fixture from the existing conftest (or move this test file into tests/unit/persistence/sqlite/ so it picks up the shared memory_db). Update any references to the fixture name memory_db in the test to rely on the conftest-provided fixture, or if you choose promotion, move the memory_db definition into a higher-level conftest.py so all tests can import it implicitly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/ai_company/api/controllers/approvals.py`:
- Line 430: The return currently uses updated; change it to return the saved
value for defensive consistency: in the approve handler replace "return
ApiResponse(data=updated)" with "return ApiResponse(data=saved)" (saved is the
result of save_if_pending), and make the analogous change in the reject handler
(line where it returns updated around the reject function) so both handlers
return the store-backed "saved" object instead of the earlier "updated"
variable.
---
Outside diff comments:
In `@tests/unit/persistence/test_migrations_v2.py`:
- Around line 20-26: The memory_db fixture is duplicated; instead of defining
async def memory_db in tests/unit/persistence/test_migrations_v2.py, remove this
local fixture and use the shared fixture from the existing conftest (or move
this test file into tests/unit/persistence/sqlite/ so it picks up the shared
memory_db). Update any references to the fixture name memory_db in the test to
rely on the conftest-provided fixture, or if you choose promotion, move the
memory_db definition into a higher-level conftest.py so all tests can import it
implicitly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 161c203d-0e8b-4d82-8180-edf1ed71d802
📒 Files selected for processing (4)
- src/ai_company/api/approval_store.py
- src/ai_company/api/controllers/approvals.py
- src/ai_company/persistence/sqlite/migrations.py
- tests/unit/persistence/test_migrations_v2.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Agent
- GitHub Check: Build Backend
- GitHub Check: Build Web
- GitHub Check: Greptile Review
- GitHub Check: Test (Python 3.14)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Python version 3.14+ with PEP 649 native lazy annotations required.
Do NOT use `from __future__ import annotations` — Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.
Line length must be 88 characters, enforced by ruff.
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`. Tests must use `test-provider`, `test-small-001`, etc.
Files:
- src/ai_company/api/approval_store.py
- src/ai_company/persistence/sqlite/migrations.py
- src/ai_company/api/controllers/approvals.py
- tests/unit/persistence/test_migrations_v2.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: All public functions must have type hints. Enforce via mypy strict mode.
Google-style docstrings required on public classes and functions, enforced by ruff D rules.
Create new objects instead of mutating existing ones. For non-Pydantic internal collections, use `copy.deepcopy()` at construction plus `MappingProxyType` wrapping for read-only enforcement.
For `dict`/`list` fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and use `copy.deepcopy()` at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 with `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. Use `@computed_field` for derived values instead of storing redundant fields (e.g., `TokenUsage.total_tokens`).
Use `NotBlankStr` from `core.types` for all identifier/name fields in Pydantic models, including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants, instead of manual whitespace validators.
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`.
Functions must be less than 50 lines; files must be less than 800 lines.
Handle errors explicitly, never silently swallow exceptions.
Validate at system boundaries: user input, external APIs, config files.
NEVER use `import logging`, `logging.getLogger()`, or `print()` in application code.
Always use `logger` as the variable name for loggers (not `_logger`, not `log`).
Event names must always use constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`). Import dire...
Files:
- src/ai_company/api/approval_store.py
- src/ai_company/persistence/sqlite/migrations.py
- src/ai_company/api/controllers/approvals.py
src/ai_company/**/[!_]*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Every module with business logic MUST have:
`from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
Files:
- src/ai_company/api/approval_store.py
- src/ai_company/persistence/sqlite/migrations.py
- src/ai_company/api/controllers/approvals.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Tests must use markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
Test coverage minimum: 80% (enforced in CI).
Async tests: use `asyncio_mode = "auto"` — no manual `@pytest.mark.asyncio` needed.
Test timeout: 30 seconds per test.
Prefer `@pytest.mark.parametrize` for testing similar cases.
Files:
tests/unit/persistence/test_migrations_v2.py
🧠 Learnings (2)
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : All error paths must log at WARNING or ERROR with context before raising.
Applied to files:
src/ai_company/api/controllers/approvals.py
📚 Learning: 2026-03-13T21:03:58.906Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/ai_company/**/[!_]*.py : Every module with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
Applied to files:
src/ai_company/api/controllers/approvals.py
🧬 Code graph analysis (4)
src/ai_company/api/approval_store.py (4)
- src/ai_company/core/approval.py (1): `ApprovalItem` (24-96)
- src/ai_company/core/enums.py (1): `ApprovalStatus` (434-440)
- src/ai_company/api/app.py (1): `_on_expire` (74-99)
- src/ai_company/memory/errors.py (1): `MemoryError` (13-14)
src/ai_company/persistence/sqlite/migrations.py (2)
- tests/unit/persistence/sqlite/test_user_repo.py (1): `db` (22-28)
- tests/unit/hr/test_persistence.py (1): `db` (29-35)
src/ai_company/api/controllers/approvals.py (4)
- src/ai_company/api/state.py (1): `approval_gate` (139-141)
- src/ai_company/memory/errors.py (1): `MemoryError` (13-14)
- src/ai_company/api/errors.py (2): `UnauthorizedError` (59-65), `ConflictError` (41-47)
- src/ai_company/api/auth/models.py (1): `AuthenticatedUser` (68-87)
tests/unit/persistence/test_migrations_v2.py (2)
- src/ai_company/persistence/sqlite/migrations.py (3): `get_user_version` (257-261), `set_user_version` (264-283), `run_migrations` (334-402)
- tests/unit/persistence/sqlite/conftest.py (1): `memory_db` (15-22)
🔇 Additional comments (15)
src/ai_company/persistence/sqlite/migrations.py (3)
224-252: V6 migration is not idempotent and crashes on partial failure recovery.
The past review comment correctly identified that this migration breaks if interrupted between steps. Specifically:
- If `parked_contexts_new` is created and populated but the `parked_contexts` rename fails, the next run will fail at line 247 (`ALTER TABLE parked_contexts RENAME TO parked_contexts_old`) because `parked_contexts_new` already exists.
- If `parked_contexts` was renamed to `parked_contexts_old` but the subsequent rename of `parked_contexts_new` to `parked_contexts` failed, the INSERT at lines 238-246 will fail because `parked_contexts` no longer exists.

The migration should check table state before executing and handle partial states.
[raise_major_issue, duplicate_comment]
🛠️ Proposed fix for crash-safe v6 migration
```diff
 async def _apply_v6(db: aiosqlite.Connection) -> None:
     """Apply schema v6: make parked_contexts.task_id nullable."""
-    for stmt in _V6_STATEMENTS:
-        await db.execute(stmt)
+    # Check current table state for crash recovery
+    cursor = await db.execute(
+        "SELECT name FROM sqlite_master WHERE type='table' "
+        "AND name IN ('parked_contexts', 'parked_contexts_new', 'parked_contexts_old')"
+    )
+    existing = {row[0] for row in await cursor.fetchall()}
+
+    # Case 1: Already migrated (parked_contexts exists, no temp tables)
+    if "parked_contexts" in existing and "parked_contexts_new" not in existing:
+        # Just ensure indexes exist
+        await db.execute(
+            "CREATE INDEX IF NOT EXISTS idx_pc_agent_id ON parked_contexts(agent_id)"
+        )
+        await db.execute(
+            "CREATE INDEX IF NOT EXISTS idx_pc_approval_id ON parked_contexts(approval_id)"
+        )
+        return
+
+    # Case 2: Interrupted after rename to _old but before _new rename
+    if "parked_contexts_old" in existing and "parked_contexts_new" in existing:
+        await db.execute("ALTER TABLE parked_contexts_new RENAME TO parked_contexts")
+        await db.execute("DROP TABLE IF EXISTS parked_contexts_old")
+        await db.execute(
+            "CREATE INDEX IF NOT EXISTS idx_pc_agent_id ON parked_contexts(agent_id)"
+        )
+        await db.execute(
+            "CREATE INDEX IF NOT EXISTS idx_pc_approval_id ON parked_contexts(approval_id)"
+        )
+        return
+
+    # Case 3: Normal migration path
+    for stmt in _V6_STATEMENTS:
+        await db.execute(stmt)
```
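The state-detection idea behind this fix can be sketched in isolation. This is an illustrative, synchronous stand-in (the real migration is async over aiosqlite); the table names mirror the migration, but the function and its return labels are hypothetical:

```python
import sqlite3

# Minimal sketch of the crash-recovery check proposed above: inspect
# sqlite_master to decide which branch of the v6 table swap to run.
def classify_v6_state(conn: sqlite3.Connection) -> str:
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' "
        "AND name IN ('parked_contexts', 'parked_contexts_new', "
        "'parked_contexts_old')"
    ).fetchall()
    existing = {r[0] for r in rows}
    # Already migrated: final table exists, no temp tables left behind.
    if "parked_contexts" in existing and "parked_contexts_new" not in existing:
        return "migrated"
    # Interrupted mid-swap: finish the rename, then drop the old table.
    if "parked_contexts_old" in existing and "parked_contexts_new" in existing:
        return "resume_rename"
    # Fresh run: execute the normal statement list.
    return "fresh"
```

Because each branch is keyed off observable table state rather than a version counter alone, re-running the migration after a crash lands in a well-defined case instead of failing on a name collision.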
148-163: LGTM!
The `_V3_STATEMENTS` correctly define `task_id TEXT` (nullable) for fresh database installs. This is consistent with the `_V6_STATEMENTS` schema, ensuring fresh installs and migrated databases have identical schemas.
318-331: Migration registration is correct.
The `_apply_v6` function and its registration in `_MIGRATIONS` follow the established pattern. The crash-safety concern was addressed in the previous comment.

tests/unit/persistence/test_migrations_v2.py (4)
31-32: LGTM!
The schema version assertion is correctly updated to 6.

53-69: LGTM!
The version assertion is correctly updated to expect the final schema version 6 after full migration.
98-129: Good test addressing prior review concern.
This test properly verifies the v6 migration makes `task_id` nullable using `PRAGMA table_info`. Consider adding assertions to verify the indexes (`idx_pc_agent_id`, `idx_pc_approval_id`) are recreated after the table swap.
[approve_code_changes, suggest_optional_refactor]
💡 Optional: Add index verification
```diff
     # Verify task_id is now nullable
     cursor = await memory_db.execute("PRAGMA table_info('parked_contexts')")
     cols = {row[1]: row[3] for row in await cursor.fetchall()}
     assert cols["task_id"] == 0  # notnull=0
+
+    # Verify indexes are recreated
+    cursor = await memory_db.execute(
+        "SELECT name FROM sqlite_master WHERE type='index' "
+        "AND tbl_name='parked_contexts'"
+    )
+    indexes = {row[0] for row in await cursor.fetchall()}
+    assert "idx_pc_agent_id" in indexes
+    assert "idx_pc_approval_id" in indexes
```
131-173: LGTM!
This test properly verifies that existing data is preserved during the v6 migration. The test setup, migration execution, and data verification are all correct.
src/ai_company/api/approval_store.py (2)
125-149: LGTM — well-designed conditional update with clear semantics.
The method correctly applies lazy expiration before the status check, and the absence of `await` between check and save ensures atomicity within the single-threaded event loop. The docstring accurately documents all three `None`-return scenarios.
177-186: LGTM — robust callback error handling.
Re-raising `MemoryError` and `RecursionError` ensures critical system errors propagate, while logging other exceptions with full traceback via `logger.exception` provides observability without crashing the expiration flow.

src/ai_company/api/controllers/approvals.py (6)
101-109: LGTM — exception handling with proper critical error propagation.
The PEP 758 except syntax is correctly applied, and re-raising `MemoryError`/`RecursionError` ensures critical system errors aren't swallowed.
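The re-raise pattern these two comments approve of can be sketched as follows. The function name and callback shape are illustrative, and `logging` stands in for the repo's `get_logger` (application code there must not import `logging` directly):

```python
import logging

logger = logging.getLogger(__name__)  # repo uses get_logger(__name__)

# Sketch of the guarded-callback pattern: critical system errors
# propagate, everything else is logged with traceback and the caller
# (e.g. an expiration sweep) continues.
def run_expire_callback(callback, item) -> bool:
    try:
        callback(item)
    except (MemoryError, RecursionError):
        raise  # critical system errors must never be swallowed
    except Exception:
        logger.exception("on-expire callback failed for %r", item)
        return False
    return True
```

The boolean return is just for observability in this sketch; the key property is the ordering of the two `except` clauses.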
147-152: LGTM — auth failure now properly logged.
The unauthorized path now logs `API_AUTH_FAILED` with context before raising, satisfying the error-path observability requirement.
177-213: Intentional signalling stub — acknowledged deferral to future scheduler.
The docstring and `.. todo::` clearly document that `_signal_resume_intent` is a signalling-only stub. The actual resume logic is deferred to a future scheduling component (§12.4) that will observe status changes and call `ApprovalGate.resume_context()`. This matches the PR objectives describing the separation of concerns between the approval controller and the scheduling layer.
306-329: LGTM — auth binding hardens against prompt injection.
Binding `requested_by` to the authenticated user's username (rather than accepting it from the request body) mitigates potential prompt-injection vectors and ensures audit integrity.
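The binding this comment approves of can be sketched with simplified types (the real handler uses `AuthenticatedUser` from `api/auth/models.py` and Pydantic DTOs; the helper below is hypothetical):

```python
from dataclasses import dataclass

# Sketch: requested_by always comes from the authenticated principal,
# never from the request body, so a client cannot spoof the audit trail.
@dataclass(frozen=True)
class AuthenticatedUser:
    username: str

def build_approval_fields(auth_user: AuthenticatedUser, body: dict) -> dict:
    # Deliberately ignore any client-supplied requested_by.
    return {
        "action_type": body["action_type"],
        "requested_by": auth_user.username,
    }
```

Dropping the field from the DTO entirely (the "phantom `requested_by`" removal mentioned in the PR description) makes the spoofing attempt impossible to even express.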
402-410: LGTM — proper optimistic concurrency control.
The `save_if_pending` call provides an atomic check-and-save, correctly handling race conditions where the approval status could change between the initial fetch and the update. The error message clearly indicates both "already decided" and "expired" scenarios.
483-510: LGTM — reject flow consistent with approve pattern.
The rejection flow correctly mirrors the approval pattern with `save_if_pending`, proper conflict handling, and resume intent signalling with the decision reason.
```python
        decision_reason=data.comment,
    )

    return ApiResponse(data=updated)
```
🧹 Nitpick | 🔵 Trivial
Minor: consider returning `saved` for defensive consistency.
When `save_if_pending` succeeds, `saved` equals `updated`, so this is correct. However, returning `saved` would be slightly more defensive and signal that the returned value comes from the store.
♻️ Optional: use `saved` in return

```diff
-    return ApiResponse(data=updated)
+    return ApiResponse(data=saved)
```

Same applies to line 511 in reject.
Pull request overview
Adds an end-to-end approval workflow gate to the engine runtime so SecOps ESCALATE verdicts and request_human_approval tool calls can park execution and later resume safely, with supporting API hardening, persistence changes, and frontend WS fixes.
Changes:
- Introduces `ApprovalGate` + models, integrates parking checks into loop helpers and both execution loops, and tracks escalations in `ToolInvoker`.
- Adds `RequestHumanApprovalTool`, DTO validation hardening (action_type format, TTL/comment bounds), controller concurrency guard (`save_if_pending`), and WS publish resilience.
- Updates SQLite schema/migrations for taskless parking and fixes dashboard approval WS handling + tests.
Reviewed changes
Copilot reviewed 39 out of 39 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| web/src/stores/approvals.ts | Refetches full approval items based on approval_id from WS payloads; reconciles filtered views. |
| web/src/__tests__/stores/approvals.test.ts | Adds async WS handler tests covering refetch, filtering, and error paths. |
| tests/unit/tools/test_invoker_escalation.py | Verifies ToolInvoker.pending_escalations for SecOps ESCALATE and parking metadata. |
| tests/unit/tools/test_approval_tool.py | Validates request_human_approval tool behavior, validation, and risk classification. |
| tests/unit/persistence/test_migrations_v2.py | Updates schema version expectations and adds v6 migration coverage for nullable task_id. |
| tests/unit/observability/test_events.py | Updates event discovery expectations to include approval_gate. |
| tests/unit/engine/test_security_factory.py | Covers new _security_factory helpers and registry approval-tool wiring. |
| tests/unit/engine/test_loop_helpers_approval.py | Ensures loop helper parks and returns PARKED / errors correctly on parking failures. |
| tests/unit/engine/test_approval_gate_models.py | Tests Pydantic frozen models for escalation + resume payloads. |
| tests/unit/engine/test_approval_gate.py | Tests parking, persistence, resume + resume-message injection mitigation behavior. |
| tests/unit/core/test_validation.py | Adds unit tests for shared is_valid_action_type(). |
| tests/unit/api/test_dto.py | Updates tests for DTO constraints (action_type, TTL bounds, approve comment length). |
| tests/unit/api/test_approval_store.py | Adds tests for save_if_pending, filters, and on-expire callback handling. |
| tests/unit/api/controllers/test_approvals_helpers.py | Covers controller helper functions for decision validation + WS publishing behavior. |
| tests/unit/api/controllers/test_approvals.py | Updates action_type payload to match new category:action requirements. |
| src/ai_company/tools/registry.py | Adds all_tools() helper used by approval-tool registry augmentation. |
| src/ai_company/tools/invoker.py | Tracks pending escalations from SecOps verdicts and tool parking metadata; batch ordering made deterministic. |
| src/ai_company/tools/approval_tool.py | Implements RequestHumanApprovalTool creating ApprovalItems and returning parking metadata. |
| src/ai_company/tools/__init__.py | Exports RequestHumanApprovalTool. |
| src/ai_company/security/timeout/parked_context.py | Allows task_id=None for taskless agents. |
| src/ai_company/security/timeout/park_service.py | Updates park API to accept optional task_id. |
| src/ai_company/persistence/sqlite/migrations.py | Bumps schema to v6 and migrates parked_contexts task_id to nullable. |
| src/ai_company/observability/events/approval_gate.py | Adds structured event constants for approval-gate lifecycle. |
| src/ai_company/engine/react_loop.py | Wires approval_gate into tool execution path via loop helpers. |
| src/ai_company/engine/plan_execute_loop.py | Wires approval_gate into per-step tool execution path. |
| src/ai_company/engine/loop_helpers.py | Adds escalation parking logic and error reporting during parking failures. |
| src/ai_company/engine/approval_gate_models.py | Defines frozen EscalationInfo and ResumePayload. |
| src/ai_company/engine/approval_gate.py | Implements context park/persist/resume + safe resume message builder. |
| src/ai_company/engine/agent_engine.py | Builds/wires default ApprovalGate and injects approval tool into ToolRegistry. |
| src/ai_company/engine/_security_factory.py | Extracts security/tool factory logic and adds approval-tool registry augmentation. |
| src/ai_company/engine/__init__.py | Re-exports ApprovalGate and gate models. |
| src/ai_company/core/validation.py | Adds shared category:action format validator. |
| src/ai_company/api/state.py | Adds approval_gate to application state. |
| src/ai_company/api/dto.py | Hardens approval DTOs: action_type validator, TTL max, approve comment length. |
| src/ai_company/api/controllers/approvals.py | Binds requested_by to auth user, uses save_if_pending, improves WS error handling, logs resume intent. |
| src/ai_company/api/approval_store.py | Adds save_if_pending() and hardens on-expire callback error handling. |
| docs/design/engine.md | Documents new parking flow in engine lifecycle. |
| README.md | Updates status to reflect approval gates are implemented. |
| CLAUDE.md | Updates repo layout/docs for approval gate + approval tool + event constants. |
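The shared `category:action` validator listed above for `core/validation.py` might look like the following sketch. The exact character set and regex are assumptions; the repo's actual `is_valid_action_type()` may be stricter or looser:

```python
import re

# Hypothetical sketch of the shared category:action format validator.
# Accepts lowercase identifiers on both sides of a single colon, e.g.
# "db:write" or "fs:delete_file". The real charset is an assumption.
_ACTION_TYPE_RE = re.compile(r"[a-z][a-z0-9_]*:[a-z][a-z0-9_]*")

def is_valid_action_type(value: str) -> bool:
    """Return True if value matches the category:action format."""
    return _ACTION_TYPE_RE.fullmatch(value) is not None
```

Centralizing the check in `core/validation.py` lets both the DTO validator and the approval tool reject malformed action types identically, instead of drifting apart.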
```python
        task_id=task_id,
    )
    existing = list(tool_registry.all_tools())
    return _ToolRegistry([*existing, approval_tool])
```
```python
    if item.status != ApprovalStatus.PENDING:
        msg = f"Approval {approval_id!r} is {item.status.value}, not pending"
        logger.warning(
            API_APPROVAL_CONFLICT,
            approval_id=approval_id,
            current_status=item.status.value,
        )
        raise ConflictError(msg)

    auth_user = request.scope.get("user")
    if not isinstance(auth_user, AuthenticatedUser):
        msg = "Authentication required"
        logger.warning(
            API_AUTH_FAILED,
            approval_id=approval_id,
            note="No authenticated user in request scope",
        )
        raise UnauthorizedError(msg)
```
- Rename _trigger_resume → _signal_resume_intent with TODO for scheduler
- Add warning log before UnauthorizedError in controller auth checks
- Add DEBUG log on successful parked record deletion
- New APPROVAL_GATE_PARK_TASKLESS event for pre-park debug log
- New APPROVAL_GATE_RISK_CLASSIFY_FAILED event for classification failure
- Fix SQLite schema: parked_contexts.task_id TEXT NOT NULL → TEXT (nullable)
- Extract shared is_valid_action_type() to core/validation.py
- Fix filtered WS handler: re-fetch query instead of blind total increment
- Add afterEach(vi.restoreAllMocks()) for spy cleanup in frontend tests
- Add 10 missing event constants to CLAUDE.md reference list
- Fix test-model-001 → test-small-001 per project conventions
- Add comprehensive tests: validation, approval_store, controller helpers, security_factory, approval_gate init/resume, risk classification failure, action_type format validation, TTL bounds, ApproveRequest comment length, loop_helpers taskless path and IOError handling

- Split park_context/resume_context into smaller helpers (<50 lines each)
- Fix duplicate APPROVAL_GATE_CONTEXT_RESUMED emission on success path
- Add V6 migration to make parked_contexts.task_id nullable on existing DBs
- Warn when external execution_loop bypasses approval gate wiring
- Guard execute() against KeyError on missing arguments
- Use APPROVAL_GATE_RISK_CLASSIFY_FAILED for "no classifier" debug log
- Fix TOCTOU race on WS duplicate prevention with post-fetch re-check
Summary
- Wires `request_human_approval` tool calls into the engine execution loop, parking agent context when human approval is required
- `RequestHumanApprovalTool`: agent-callable tool that creates an `ApprovalItem` in the approval store and signals execution parking via metadata
- `ApprovalGate` service: coordinates context serialization (via `ParkService`), persistence (via `ParkedContextRepository`), and resume with decision message injection
- `ReactLoop` and `PlanExecuteLoop` check for escalations after tool execution and return a `PARKED` termination reason
- `ToolInvoker` escalation tracking: detects ESCALATE verdicts and parking metadata, exposes `pending_escalations` with proper `tuple[EscalationInfo, ...]` typing
- Security: `requested_by` bound to auth user (not request body), prompt injection mitigation in resume messages, broadened WebSocket error handling
- Frontend: reads the `approval_id` key from backend payload and fetches full items via API

Closes #258, closes #259
Review coverage
Pre-reviewed by 15 specialized agents (docs-consistency, code-reviewer, python-reviewer, test-analyzer, silent-failure-hunter, comment-analyzer, type-design-analyzer, logging-audit, resilience-audit, conventions-enforcer, security-reviewer, api-contract-drift, test-quality-reviewer, async-concurrency-reviewer, issue-resolution-verifier).
43 findings identified and addressed across Critical (14), Major (23), and Medium (6) severities. Key fixes:
- `tuple[Any, ...]` → `tuple[EscalationInfo, ...]`

Test plan
- `uv run ruff check src/ tests/` — clean
- `uv run mypy src/ tests/` — clean (926 files, strict)
- `uv run pytest tests/ -n auto --cov=ai_company --cov-fail-under=80` — 7540 passed, 94.60% coverage