fix: harden subprocess cleanup, env filtering, and shutdown resilience by Aureliolo · Pull Request #155 · Aureliolo/synthorg

Aureliolo · 2026-03-07T13:10:02Z

Summary

Extract shared subprocess transport cleanup (_process_cleanup.py) — eliminates duplication between _git_base.py and subprocess_sandbox.py, with guarded exception handling to never mask primary results
Harden stderr handling in git timeout path — sanitize control characters and truncate to 500 chars before including in error messages (defense in depth)
Add field_validator for extra_safe_path_prefixes config — reject empty strings and relative paths at the Pydantic validation boundary
Improve shutdown resilience — extract _log_post_cancel_exceptions() to reduce complexity, log post-cancel task exceptions at DEBUG instead of silently discarding, add stderr.write fallback for Windows signal handler when no event loop is available
Log killpg OSError in sandbox kill fallback instead of silently suppressing
Fix inaccurate comment in list_directory.py about total count semantics
Update DESIGN_SPEC.md §11.1.1 (flag injection scope) and §11.1.2 (sandbox features) for accuracy
Replace inline logging string with proper event constant in _process_cleanup.py
Update review skills — default behavior is now "fix everything valid" (including pre-existing issues)

Test plan

2457 tests pass, 5 skipped (Windows symlink + Unix-only)
96.11% coverage (≥80% gate)
New tests: test_timeout_includes_stderr_fragment, test_timeout_sanitizes_stderr_control_chars
New tests: TestExtraSafePathPrefixes (5 tests — inclusion, filtering, defaults, empty string rejection, relative path rejection)
ruff lint + format clean
mypy strict clean (279 files)
All pre-commit hooks pass

Review coverage

Pre-reviewed by 10 agents: code-reviewer, python-reviewer, pr-test-analyzer, silent-failure-hunter, comment-analyzer, type-design-analyzer, logging-audit, resilience-audit, security-reviewer, docs-consistency. 15 findings triaged and addressed (11 from PR changes + 4 pre-existing).

🤖 Generated with Claude Code

CI fixes: - Remove unused type: ignore on Linux (use dual suppression for cross-platform compat) in subprocess_sandbox.py - Fix test assertion for raise-from-build_exc chaining in test_agent_engine_errors.py Review feedback (Greptile): - Rename _check_ref → _check_git_arg with updated docstring (was misleadingly named when used for author/date validation) - Fix missing space in clone URL error message - Add EXECUTION_SHUTDOWN_CLEANUP_FAILED event constant (was reusing EXECUTION_SHUTDOWN_CLEANUP for both start and failure) - Capture partial stderr on timeout in direct subprocess path (mirrors sandbox behavior for better debuggability) - Make Windows Git PATH prefixes configurable via SubprocessSandboxConfig.extra_safe_path_prefixes - Add clarifying comment for total_entries semantics when truncated - Fix ResourceWarning on Windows by explicitly closing subprocess transports after communicate() Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Document Python 3.14+, uv, and Git 2.x+ as system-level prerequisites. Git is a runtime dependency for the built-in git tools (subprocess-based, not managed by pip/uv). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Pre-reviewed by 10 agents, 15 findings addressed: - Extract shared subprocess transport cleanup utility (_process_cleanup.py) - Add stderr sanitization (control char stripping + truncation) to git timeout path - Add field_validator for extra_safe_path_prefixes (reject empty/relative paths) - Log killpg OSError in sandbox instead of silently suppressing - Extract _log_post_cancel_exceptions in shutdown to reduce complexity - Add stderr.write fallback in Windows signal handler (no event loop case) - Update DESIGN_SPEC.md §11.1.1 and §11.1.2 for accuracy - Fix inaccurate comment in list_directory.py - Replace inline logging string with event constant - Add tests for timeout stderr, extra_safe_path_prefixes validation Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-03-07T13:10:11Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

coderabbitai · 2026-03-07T13:10:16Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ce7ba261-9711-4f09-9f33-fd9fea2fcfd2

📥 Commits

Reviewing files that changed from the base of the PR and between 9a5a868 and 940068d.

📒 Files selected for processing (1)

src/ai_company/tools/_git_base.py

📝 Walkthrough

Summary by CodeRabbit

New Features
- Configurable extra safe path prefixes for sandbox environments.
Bug Fixes
- Timeouts now include sanitized, truncated subprocess stderr fragments.
- Improved Windows subprocess cleanup, safer subprocess closing, and more robust shutdown fallback behavior.
- Added sandbox kill-fallback observability and safer process cleanup handling.
Documentation
- Added System Requirements and updated Pre-PR review guidance to require addressing all valid findings.
Tests
- New unit tests for shutdown, sandbox prefixes, git timeout handling, listing truncation, and subprocess cleanup.

Walkthrough

Policy docs now mandate fixing all valid findings; adds subprocess transport cleanup utility and observability keys; strengthens git arg validation and stderr sanitization; extends sandbox config with extra safe PATH prefixes and ensures subprocess transport cleanup; updates shutdown logging/handling and associated tests.

Changes

Cohort / File(s)	Summary
Policy & Documentation `/.claude/skills/aurelio-review-pr/SKILL.md`, `/.claude/skills/pre-pr-review/SKILL.md`, `CLAUDE.md`, `DESIGN_SPEC.md`, `README.md`	Reworded review policy to require implementing all valid findings (no deferral/skip); expanded DESIGN_SPEC notes on git arg validation, sandboxing, and Windows transport cleanup; README gains duplicated System Requirements insertion.
Observability Constants `src/ai_company/observability/events/execution.py`, `src/ai_company/observability/events/sandbox.py`, `src/ai_company/observability/events/tool.py`	Added constants: `EXECUTION_SHUTDOWN_CLEANUP_FAILED`, `SANDBOX_KILL_FALLBACK`, and `TOOL_SUBPROCESS_TRANSPORT_CLOSE_FAILED`.
Engine Shutdown `src/ai_company/engine/shutdown.py`	Extracted post-cancel exception logging to `_log_post_cancel_exceptions`; changed cleanup-failure logging to use new event; tightened Unix/Windows signal fallback handling and added last-resort loop.stop()/stderr fallback behavior.
Subprocess Transport Cleanup Utility `src/ai_company/tools/_process_cleanup.py`	New helper `close_subprocess_transport(proc)` that safely closes CPython-internal subprocess transport when present; logs and suppresses close failures.
Git Base & Tools `src/ai_company/tools/_git_base.py`, `src/ai_company/tools/git_tools.py`	Renamed `_check_ref` → `_check_git_arg`; added `_sanitize_stderr()` to strip control chars and cap stderr fragments; ensure direct subprocess paths call `close_subprocess_transport` in finally blocks; updated call sites and timeout messaging.
Sandbox Config & Subprocess Sandbox `src/ai_company/tools/sandbox/config.py`, `src/ai_company/tools/sandbox/subprocess_sandbox.py`	Added `extra_safe_path_prefixes` with validation; made `_get_safe_path_prefixes` an instance method merging extra prefixes; added `_close_process()` delegating to `close_subprocess_transport`; improved `_kill_process()` fallback logging and ensured transport cleanup in `execute`.
File-system list directory `src/ai_company/tools/file_system/list_directory.py`	`_list_sync` now returns `(lines, raw_capped)`; propagated `raw_capped` through `_format_listing_result` and `execute` to centralize truncation metadata.
Tests `tests/unit/engine/test_agent_engine_errors.py`, `tests/unit/engine/test_shutdown.py`, `tests/unit/tools/git/test_git_tools.py`, `tests/unit/tools/sandbox/test_subprocess_sandbox.py`, `tests/unit/tools/test_process_cleanup.py`, `tests/unit/tools/file_system/test_list_directory.py`	Added/updated tests: shutdown post-cancel logging and signal fallbacks; Git timeout stderr sanitization/truncation tests; subprocess cleanup utility tests; sandbox extra_prefix validation and PATH filtering tests; adjusted file-system truncation and fatal-error cause expectations.

Sequence Diagram(s)

sequenceDiagram
    participant Caller as Caller
    participant Sandbox as SubprocessSandbox
    participant Subproc as asyncio.subprocess.Process
    participant Cleanup as close_subprocess_transport
    participant Obs as Observability

    rect rgba(200,200,255,0.5)
    Caller->>Sandbox: execute(command, ...)
    end

    Sandbox->>Subproc: spawn subprocess
    Note right of Subproc: run / communicate / timeout possible

    alt timeout or normal completion
        Sandbox->>Subproc: communicate_with_timeout()
    end

    rect rgba(200,255,200,0.5)
    Sandbox->>Cleanup: _close_process(proc)
    Cleanup->>Subproc: access _transport and close()
    Cleanup->>Obs: on close failure -> emit TOOL_SUBPROCESS_TRANSPORT_CLOSE_FAILED
    end

    alt kill failed (OSError)
        Sandbox->>Obs: emit SANDBOX_KILL_FALLBACK with pid/error
        Sandbox->>Subproc: attempt proc.kill() (suppressed)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

feat: implement built-in git tools with security hardening #150: Overlaps git tooling internals and subprocess transport cleanup integration (renames, stderr sanitization, process-close calls).
feat: implement subprocess sandbox for tool execution isolation (#131) #153: Overlaps SubprocessSandbox behavior and observability around sandbox kills and subprocess closing (extra_safe_path_prefixes, _close_process usage).
fix: harden subprocess cleanup, env filtering, and shutdown resilience #155: Broadly overlaps the combined changes (process cleanup util, stderr sanitization, sandbox adjustments, shutdown logging) and likely touches the same files/tests.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 68.18% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main focus of the changeset: hardening subprocess cleanup, environment filtering, and shutdown resilience across multiple components.
Description check	✅ Passed	The description is comprehensive and directly related to the changeset, clearly outlining all major changes, test coverage, and verification steps.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch fix/post-merge-ci-and-review-feedback

✨ Simplify code

Create PR with simplified code
Commit simplified code in branch fix/post-merge-ci-and-review-feedback

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist · 2026-03-07T13:10:33Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness, security, and maintainability of the system by focusing on subprocess management, error handling, and shutdown procedures. It centralizes subprocess cleanup, improves the sanitization and reporting of errors, and strengthens the sandbox environment's path filtering capabilities. Additionally, it refines the internal review process guidelines to ensure comprehensive issue resolution, contributing to a more stable and secure codebase.

Highlights

Subprocess Cleanup: Extracted shared subprocess transport cleanup logic into a new utility module (_process_cleanup.py) to eliminate duplication and prevent ResourceWarning on Windows, with guarded exception handling.
Error Handling & Resilience: Hardened stderr handling in git timeout paths by sanitizing control characters and truncating output. Improved shutdown resilience by refactoring logging of post-cancel task exceptions and adding a stderr.write fallback for Windows signal handlers.
Sandbox Configuration: Added a field_validator for extra_safe_path_prefixes in the sandbox configuration to reject empty strings and relative paths, enhancing security and configuration robustness.
Logging Improvements: Introduced new logging events for shutdown cleanup failures and sandbox kill fallbacks, and logged killpg OSError instead of silently suppressing it.
Git Tool Hardening: Generalized git argument validation to reject flag-like values for various parameters (not just refs/branches) and updated documentation for accuracy.
Review Process Updates: Updated internal skill documentation (SKILL.md, CLAUDE.md) to clarify that the default behavior for review agents is to implement ALL valid findings, including pre-existing issues, without deferring or skipping.

Changelog

.claude/skills/aurelio-review-pr/SKILL.md
- Updated review skill instructions to emphasize implementing all valid findings, including pre-existing issues, and clarified the 'Implement all (Recommended)' option.
.claude/skills/pre-pr-review/SKILL.md
- Modified pre-PR review skill instructions to explicitly state that all valid findings, including pre-existing issues, should be implemented by default.
CLAUDE.md
- Added a new instruction emphasizing that all valid review findings, including pre-existing issues, should be fixed without deferral or skipping.
DESIGN_SPEC.md
- Updated the description of built-in git tools to reflect broader flag injection prevention for various git arguments.
- Updated the SubprocessSandbox description to include configurable extra_safe_path_prefixes and explicit transport cleanup on Windows.
README.md
- Added a new 'System Requirements' section detailing Python, uv, and Git versions.
src/ai_company/engine/shutdown.py
- Removed unused contextlib import.
- Added EXECUTION_SHUTDOWN_CLEANUP_FAILED event constant.
- Extracted _log_post_cancel_exceptions method to handle and log exceptions from tasks after cancellation.
- Updated cleanup callback error logging to use the new EXECUTION_SHUTDOWN_CLEANUP_FAILED event.
- Implemented a sys.stderr.write fallback for Windows signal handling when no event loop is available, improving shutdown resilience.
src/ai_company/observability/events/execution.py
- Added EXECUTION_SHUTDOWN_CLEANUP_FAILED event constant for logging cleanup failures.
src/ai_company/observability/events/sandbox.py
- Added SANDBOX_KILL_FALLBACK event constant for logging fallback kill attempts.
src/ai_company/observability/events/tool.py
- Added TOOL_SUBPROCESS_TRANSPORT_CLOSE_FAILED event constant for logging subprocess transport closure failures.
src/ai_company/tools/_git_base.py
- Imported close_subprocess_transport utility.
- Defined regex for control characters and a maximum stderr fragment length.
- Added _sanitize_stderr function to strip control characters and truncate stderr output.
- Renamed _check_ref to _check_git_arg and updated its docstring to reflect broader applicability for git arguments.
- Modified _await_git_process to capture and sanitize stderr fragments for inclusion in timeout error messages.
- Ensured close_subprocess_transport is called in a finally block after _run_git_direct to ensure resource cleanup.
src/ai_company/tools/_process_cleanup.py
- Added a new module containing close_subprocess_transport function to explicitly close asyncio subprocess transports, preventing ResourceWarning on Windows.
src/ai_company/tools/file_system/list_directory.py
- Added a clarifying comment about the semantics of the total count in _format_listing_result.
src/ai_company/tools/git_tools.py
- Updated various git tool methods (_build_filter_args, GitLogTool.execute, GitDiffTool.execute, _create_branch, GitBranchTool.execute, GitCloneTool.execute) to use the more general _check_git_arg instead of _check_ref for argument validation.
src/ai_company/tools/sandbox/config.py
- Imported PurePath and field_validator from pydantic.
- Added extra_safe_path_prefixes field to SubprocessSandboxConfig for configurable safe PATH entries.
- Implemented a field_validator for extra_safe_path_prefixes to ensure entries are non-empty and absolute paths.
src/ai_company/tools/sandbox/subprocess_sandbox.py
- Imported new sandbox event SANDBOX_KILL_FALLBACK and close_subprocess_transport utility.
- Modified _get_safe_path_prefixes to incorporate extra_safe_path_prefixes from the sandbox configuration.
- Added logging for OSError in _kill_process when killpg fails, using the SANDBOX_KILL_FALLBACK event.
- Added _close_process static method to delegate subprocess transport cleanup.
- Ensured _close_process is called in a finally block after _communicate_with_timeout to ensure resource cleanup.
tests/unit/engine/test_agent_engine_errors.py
- Updated the assertion in test_handle_fatal_error_secondary_failure_raises_original to check for ValueError and its message in the exception cause.
tests/unit/tools/git/test_git_tools.py
- Added test_timeout_includes_stderr_fragment to verify that timeout messages include sanitized stderr output.
- Added test_timeout_sanitizes_stderr_control_chars to confirm that control characters are stripped from stderr in timeout messages.
tests/unit/tools/sandbox/test_subprocess_sandbox.py
- Added TestExtraSafePathPrefixes class with tests for extra_prefixes_included, extra_prefix_survives_path_filter, default_empty_no_change, rejects_empty_string, and rejects_relative_path.

Activity

The pull request includes 2457 passing tests and 5 skipped tests (Windows symlink + Unix-only).
Code coverage is at 96.11%, meeting the ≥80% gate.
New tests were added for git timeout stderr handling and extra_safe_path_prefixes validation.
Ruff lint and format checks passed, and mypy strict clean was achieved across 279 files.
All pre-commit hooks passed.
The PR underwent pre-review by 10 agents (code-reviewer, python-reviewer, pr-test-analyzer, silent-failure-hunter, comment-analyzer, type-design-analyzer, logging-audit, resilience-audit, security-reviewer, docs-consistency).
15 findings were triaged and addressed, with 11 stemming from PR changes and 4 from pre-existing issues.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces significant hardening and resilience improvements across subprocess handling, shutdown logic, and environment filtering. Key changes include extracting shared subprocess cleanup logic into _process_cleanup.py, hardening stderr handling in the git timeout path, and enhancing shutdown resilience. However, a potential security weakness was identified in the stderr sanitization logic for git timeout messages, which could be exploited for indirect prompt injection due to incomplete stripping of control characters (specifically newlines). Additionally, consider making the new process cleanup utility more explicit about its platform-specific nature.

gemini-code-assist · 2026-03-07T13:12:56Z

src/ai_company/tools/_git_base.py

 )


+_CONTROL_CHAR_RE = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")


The _sanitize_stderr function uses an incomplete regex for stripping control characters. Specifically, the _CONTROL_CHAR_RE regex [\x00-\x08\x0b-\x1f\x7f] does not include the newline character (\x0a or \n). Since the sanitized stderr_fragment is included in the ToolExecutionResult content (which is fed back to the LLM), an attacker who can control the stderr of a git command (e.g., via a malicious repository or a slow server causing a timeout) could inject newlines followed by a prompt injection payload. This could manipulate the agent's behavior.

Consider updating the regex to include all control characters, including newlines and tabs.

gemini-code-assist · 2026-03-07T13:12:56Z

src/ai_company/tools/_process_cleanup.py

+def close_subprocess_transport(proc: asyncio.subprocess.Process) -> None:
+    """Close subprocess transport to prevent ResourceWarning on Windows.
+
+    On Windows with ProactorEventLoop, pipe transports may not be
+    closed promptly after kill+communicate, causing ResourceWarning
+    at GC time.  Explicitly closing the transport avoids this.
+
+    Uses ``getattr`` to access the CPython-internal ``_transport``
+    attribute — if the attribute is absent (different runtime or
+    future CPython version), this is a no-op.  Exceptions from
+    ``close()`` are logged and suppressed so cleanup never masks
+    the primary result.
+
+    Args:
+        proc: The subprocess whose transport should be closed.
+    """
+    transport = getattr(proc, "_transport", None)
+    if transport is None or transport.is_closing():
+        return
+    try:
+        transport.close()
+    except Exception:
+        logger.debug(
+            TOOL_SUBPROCESS_TRANSPORT_CLOSE_FAILED,
+            exc_info=True,
+        )


While the use of getattr makes this function safe on platforms where _transport doesn't exist, the docstring explicitly mentions this is a workaround for ProactorEventLoop on Windows. To make the intent clearer and avoid any potential side effects on other platforms, consider adding a platform check at the beginning of the function to make it a no-op on non-Windows systems.

def close_subprocess_transport(proc: asyncio.subprocess.Process) -> None: """Close subprocess transport to prevent ResourceWarning on Windows. On Windows with ProactorEventLoop, pipe transports may not be closed promptly after kill+communicate, causing ResourceWarning at GC time. Explicitly closing the transport avoids this. Uses ``getattr`` to access the CPython-internal ``_transport`` attribute — if the attribute is absent (different runtime or future CPython version), this is a no-op. Exceptions from ``close()`` are logged and suppressed so cleanup never masks the primary result. Args: proc: The subprocess whose transport should be closed. """ import os if os.name != "nt": return transport = getattr(proc, "_transport", None) if transport is None or transport.is_closing(): return try: transport.close() except Exception: logger.debug( TOOL_SUBPROCESS_TRANSPORT_CLOSE_FAILED, exc_info=True, )

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/ai_company/tools/_git_base.py (1)

321-346: ⚠️ Potential issue | 🟠 Major

Apply the timeout stderr sanitization to sandboxed git runs too.

Lines 321-346 harden only the direct-subprocess timeout path. When self._sandbox is set, _run_git() returns through _sandbox_result_to_execution_result(), which still surfaces raw result.stderr on timeout, so control characters and oversized stderr bypass this new hardening entirely.

🔧 Suggested change

     def _sandbox_result_to_execution_result(
         args: list[str],
         result: SandboxResult,
         *,
         deadline: float,
@@
         if result.timed_out:
+            stderr_fragment = _sanitize_stderr(result.stderr.strip())
             logger.warning(
                 GIT_COMMAND_TIMEOUT,
                 command=_sanitize_command(["git", *args]),
                 deadline=deadline,
+                stderr_fragment=stderr_fragment,
             )
+            msg = f"Git command timed out after {deadline}s"
+            if stderr_fragment:
+                msg += f": {stderr_fragment}"
             return ToolExecutionResult(
-                content=result.stderr or "Git command timed out",
+                content=msg,
                 is_error=True,
             )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/tools/_git_base.py` around lines 321 - 346, The sandbox path
still returns raw stderr on timeout; update _sandbox_result_to_execution_result
to apply the same stderr sanitization and truncation used in _run_git (use
_sanitize_stderr) before building the ToolExecutionResult so control characters
and oversized stderr are never exposed; specifically, detect timeout cases in
_sandbox_result_to_execution_result (where self._sandbox results are converted),
run _sanitize_stderr(result.stderr.decode(...)) and pass the sanitized fragment
into the returned ToolExecutionResult content/fields (and any logged messages)
instead of the raw result.stderr.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/engine/shutdown.py`:
- Around line 263-266: The try/except around task.exception() currently swallows
errors; replace the silent pass with a debug-level log that records the failure
and the task identity so shutdown diagnostics aren’t lost. Specifically, in the
shutdown helper where you call task.exception() and assign exc, catch Exception
and call the module logger (e.g., logger.debug(...) or
process_logger.debug(...)) including the task variable and the exception info
(exc_info=True or use logger.exception at debug level) so unexpected failures
retrieving task.exception() are recorded.

In `@src/ai_company/tools/_process_cleanup.py`:
- Around line 36-40: The is_closing() check on the transport is unguarded and
can raise if the method is missing; wrap the probe in the same safety as close()
by moving the is_closing() call into the try/except block (or defensively check
callable(getattr(transport, "is_closing", None)) before invoking) so that both
transport.is_closing() and transport.close() are protected from unexpected
exceptions; update references in this block that use proc._transport,
transport.is_closing, and transport.close to ensure any error from either method
is caught and handled consistently.

In `@src/ai_company/tools/file_system/list_directory.py`:
- Around line 234-238: The truncation flag is computed from len(lines) which is
wrong because _list_sync stops after MAX_ENTRIES + 1 raw entries and
_classify_entry may drop some, so len(lines) <= MAX_ENTRIES does not guarantee
completeness; modify the listing logic to use the explicit raw-truncation
indicator from _list_sync (e.g., a returned boolean like raw_truncated or an
extra sentinel entry) to set metadata["truncated"] and compute total_entries
from the true count of visible entries (post-classification) while preserving
MAX_ENTRIES semantics; update the code paths around _list_sync, _classify_entry,
lines, MAX_ENTRIES, metadata["truncated"], and total_entries so truncated is
derived from the raw-truncated flag rather than len(lines) > MAX_ENTRIES.

---

Outside diff comments:
In `@src/ai_company/tools/_git_base.py`:
- Around line 321-346: The sandbox path still returns raw stderr on timeout;
update _sandbox_result_to_execution_result to apply the same stderr sanitization
and truncation used in _run_git (use _sanitize_stderr) before building the
ToolExecutionResult so control characters and oversized stderr are never
exposed; specifically, detect timeout cases in
_sandbox_result_to_execution_result (where self._sandbox results are converted),
run _sanitize_stderr(result.stderr.decode(...)) and pass the sanitized fragment
into the returned ToolExecutionResult content/fields (and any logged messages)
instead of the raw result.stderr.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3cf70ca9-ab84-4204-9bf4-bf478bb55b10

📥 Commits

Reviewing files that changed from the base of the PR and between 3c8394e and 1e4d779.

📒 Files selected for processing (18)

.claude/skills/aurelio-review-pr/SKILL.md
.claude/skills/pre-pr-review/SKILL.md
CLAUDE.md
DESIGN_SPEC.md
README.md
src/ai_company/engine/shutdown.py
src/ai_company/observability/events/execution.py
src/ai_company/observability/events/sandbox.py
src/ai_company/observability/events/tool.py
src/ai_company/tools/_git_base.py
src/ai_company/tools/_process_cleanup.py
src/ai_company/tools/file_system/list_directory.py
src/ai_company/tools/git_tools.py
src/ai_company/tools/sandbox/config.py
src/ai_company/tools/sandbox/subprocess_sandbox.py
tests/unit/engine/test_agent_engine_errors.py
tests/unit/tools/git/test_git_tools.py
tests/unit/tools/sandbox/test_subprocess_sandbox.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Agent
GitHub Check: Greptile Review

🧰 Additional context used

📓 Path-based instructions (3)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Never use from __future__ import annotations — Python 3.14 has PEP 649 native lazy annotations
Use except A, B: syntax (without parentheses) as enforced by ruff for PEP 758 on Python 3.14
All public functions must have type hints and be compatible with mypy strict mode
All public classes and functions must have Google-style docstrings (enforced by ruff D rules)
Create new objects instead of mutating existing ones — implement immutability
For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction and wrap with MappingProxyType for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves
Never mix static config fields with mutable runtime fields in one Pydantic model
Use Pydantic v2 with BaseModel, model_validator, computed_field, and ConfigDict
Use @computed_field for derived values instead of storing and validating redundant fields (e.g., TokenUsage.total_tokens)
Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls) — use structured concurrency over bare create_task
Maintain line length of 88 characters (enforced by ruff)
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly — never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)

Files:

src/ai_company/observability/events/sandbox.py
src/ai_company/observability/events/tool.py
tests/unit/engine/test_agent_engine_errors.py
src/ai_company/tools/_git_base.py
src/ai_company/observability/events/execution.py
src/ai_company/tools/git_tools.py
src/ai_company/tools/sandbox/subprocess_sandbox.py
src/ai_company/tools/sandbox/config.py
tests/unit/tools/sandbox/test_subprocess_sandbox.py
src/ai_company/tools/_process_cleanup.py
tests/unit/tools/git/test_git_tools.py
src/ai_company/engine/shutdown.py
src/ai_company/tools/file_system/list_directory.py

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Every module with business logic must import logging via from ai_company.observability import get_logger and create logger = get_logger(__name__)
Never use import logging, logging.getLogger(), or print() in application code
Always name the logger variable logger (not _logger, not log)
Always use event name constants from ai_company.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider) — import directly from the specific events module
Use structured logging with logger.info(EVENT, key=value) syntax — never use format strings like logger.info('msg %s', val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Use DEBUG log level for object creation, internal flow, and entry/exit of key functions
All provider calls must go through BaseCompletionProvider which applies retry and rate limiting automatically
Never implement retry logic in driver subclasses or calling code — retry is handled by BaseCompletionProvider base class
Set RetryConfig and RateLimiterConfig per-provider in ProviderConfig
Retryable errors include RateLimitError, ProviderTimeoutError, ProviderConnectionError, and ProviderInternalError — all with is_retryable=True
Non-retryable errors must raise immediately without retry
RetryExhaustedError signals all retries have failed — engine layer catches this to trigger fallback chains
Rate limiter must respect RateLimitError.retry_after from providers and automatically pause future requests
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names: example-provider, example-large-001, example-medium-001, example-small-001, or large/medium/small as aliases

Files:

src/ai_company/observability/events/sandbox.py
src/ai_company/observability/events/tool.py
src/ai_company/tools/_git_base.py
src/ai_company/observability/events/execution.py
src/ai_company/tools/git_tools.py
src/ai_company/tools/sandbox/subprocess_sandbox.py
src/ai_company/tools/sandbox/config.py
src/ai_company/tools/_process_cleanup.py
src/ai_company/engine/shutdown.py
src/ai_company/tools/file_system/list_directory.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark unit tests with @pytest.mark.unit
Mark integration tests with @pytest.mark.integration
Mark end-to-end tests with @pytest.mark.e2e
Mark slow tests with @pytest.mark.slow
Maintain 80% minimum code coverage (enforced in CI)
All tests must complete within 30 seconds timeout
Use pytest-xdist via -n auto for parallel test execution
Prefer @pytest.mark.parametrize for testing similar cases
Use test-provider, test-small-001 (not vendor-specific names) in tests

Files:

tests/unit/engine/test_agent_engine_errors.py
tests/unit/tools/sandbox/test_subprocess_sandbox.py
tests/unit/tools/git/test_git_tools.py

🧠 Learnings (9)

📚 Learning: 2026-03-07T12:24:27.098Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T12:24:27.098Z
Learning: For trivial/docs-only changes use `/pre-pr-review quick` to skip agents but still run automated checks

Applied to files:

.claude/skills/aurelio-review-pr/SKILL.md
.claude/skills/pre-pr-review/SKILL.md
CLAUDE.md

📚 Learning: 2026-03-07T12:24:27.098Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T12:24:27.098Z
Learning: Surface improvements as suggestions, not silent changes — user decides whether to proceed

Applied to files:

.claude/skills/pre-pr-review/SKILL.md

📚 Learning: 2026-03-07T12:24:27.098Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T12:24:27.098Z
Learning: Present every implementation plan to the user for accept/deny before coding starts

Applied to files:

.claude/skills/pre-pr-review/SKILL.md

📚 Learning: 2026-03-07T12:24:27.098Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T12:24:27.098Z
Learning: NEVER create a PR directly — use `/pre-pr-review` to create PRs with automated checks and review agents

Applied to files:

.claude/skills/pre-pr-review/SKILL.md
CLAUDE.md

📚 Learning: 2026-03-07T12:24:27.098Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T12:24:27.098Z
Learning: Use `/aurelio-review-pr` to handle external reviewer feedback after PR is created

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-07T12:24:27.098Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T12:24:27.098Z
Learning: Run pre-commit hooks (trailing-whitespace, end-of-file-fixer, check-yaml, check-toml, check-json, check-merge-conflict, check-added-large-files, no-commit-to-branch, ruff check+format, gitleaks) before committing

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-07T12:24:27.098Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T12:24:27.098Z
Learning: After finishing an issue implementation, create a feature branch (`<type>/<slug>`), commit, and push — do NOT create a PR automatically

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-07T12:24:27.098Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T12:24:27.098Z
Learning: Update `DESIGN_SPEC.md` to reflect approved deviations when they occur

Applied to files:

DESIGN_SPEC.md

📚 Learning: 2026-03-07T12:24:27.098Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T12:24:27.098Z
Learning: Always read `DESIGN_SPEC.md` before implementing any feature or planning any issue

Applied to files:

DESIGN_SPEC.md

🧬 Code graph analysis (6)

src/ai_company/tools/_git_base.py (2)

src/ai_company/tools/_process_cleanup.py (1)

close_subprocess_transport (20-45)

src/ai_company/tools/base.py (1)

ToolExecutionResult (24-53)

src/ai_company/tools/git_tools.py (1)

src/ai_company/tools/_git_base.py (1)

_check_git_arg (198-227)

src/ai_company/tools/sandbox/subprocess_sandbox.py (1)

src/ai_company/tools/_process_cleanup.py (1)

close_subprocess_transport (20-45)

tests/unit/tools/sandbox/test_subprocess_sandbox.py (2)

src/ai_company/tools/sandbox/subprocess_sandbox.py (5)

config (108-110)

SubprocessSandbox (61-533)

workspace (113-115)

_get_safe_path_prefixes (193-210)

_build_filtered_env (212-255)

src/ai_company/tools/sandbox/config.py (1)

SubprocessSandboxConfig (8-83)

src/ai_company/tools/_process_cleanup.py (1)

src/ai_company/observability/_logger.py (1)

get_logger (8-28)

tests/unit/tools/git/test_git_tools.py (3)

tests/unit/tools/git/conftest.py (2)

status_tool (123-125)

_run_git (31-39)

src/ai_company/tools/git_tools.py (1)

GitStatusTool (60-123)

src/ai_company/tools/_git_base.py (1)

_run_git (439-471)

🪛 LanguageTool

.claude/skills/aurelio-review-pr/SKILL.md

[style] ~320-~320: To elevate your writing, try using a synonym here.
Context: ...ting" or "out of scope" — if a reviewer found a valid issue in code touched by (or ad...

(FIND_LOCATE)

.claude/skills/pre-pr-review/SKILL.md

[style] ~294-~294: To elevate your writing, try using a synonym here.
Context: ...isting" or "out of scope" — if an agent found a valid issue in code touched by (or ad...

(FIND_LOCATE)

CLAUDE.md

[style] ~132-~132: To elevate your writing, try using a synonym here.
Context: ...alid — never skip**: When review agents find valid issues (including pre-existing is...

(FIND_LOCATE)

🔇 Additional comments (15)

README.md (1)

39-44: LGTM!

The new System Requirements section is well-placed before "Getting Started" and accurately documents the runtime dependencies (Python 3.14+, uv, Git 2.x+). This aligns with the Tech Stack section and provides clear prerequisites for users.

tests/unit/engine/test_agent_engine_errors.py (1)

264-265: LGTM!

The updated assertions properly verify exception chaining behavior — confirming that when _handle_fatal_error encounters a secondary failure, the original RuntimeError is raised with the secondary ValueError correctly set as its __cause__. This provides stronger validation of the error propagation contract.

src/ai_company/observability/events/tool.py (1)

42-46: LGTM!

The new TOOL_SUBPROCESS_TRANSPORT_CLOSE_FAILED constant follows established patterns: proper Final[str] typing, consistent TOOL_* prefix, hierarchical dot-notation value, and organized under a clear section comment.

src/ai_company/observability/events/execution.py (1)

45-45: LGTM!

The new EXECUTION_SHUTDOWN_CLEANUP_FAILED constant is logically placed between the cleanup start and timeout events, completing the shutdown cleanup event hierarchy.

src/ai_company/observability/events/sandbox.py (1)

16-16: LGTM!

The new SANDBOX_KILL_FALLBACK constant is appropriately placed after SANDBOX_KILL_FAILED, logically grouping kill-related events together.

.claude/skills/aurelio-review-pr/SKILL.md (2)

320-327: LGTM!

The updated policy clearly communicates the new default behavior: implement all valid findings including pre-existing issues. The options list properly reflects "Implement all (Recommended)" as the preferred choice.

414-414: LGTM!

The new rule "Fix everything valid — never defer, never skip" is consistent with the updated Phase 6 guidance and reinforces the policy throughout the document.

.claude/skills/pre-pr-review/SKILL.md (2)

293-301: LGTM!

The policy update mirrors the changes in aurelio-review-pr/SKILL.md, maintaining consistency across review pipelines. Users retain the ability to skip items via AskUserQuestion, but the default now emphasizes comprehensive fixes.

399-399: LGTM!

The rule "Fix everything valid — never defer, never skip" completes the policy alignment with the Aurelio review pipeline.

CLAUDE.md (1)

132-132: LGTM!

The new "Fix everything valid — never skip" policy is appropriately placed in the Pre-PR Review section and provides clear, actionable guidance consistent with the SKILL.md updates.

src/ai_company/tools/sandbox/subprocess_sandbox.py (5)

17-32: LGTM!

The new imports are correctly placed and follow existing conventions. SANDBOX_KILL_FALLBACK is properly imported from the domain-specific events module, and close_subprocess_transport is imported from the shared process cleanup utility.

193-210: LGTM!

The refactoring from static method to instance method is correctly implemented. The method needs self to access self._config.extra_safe_path_prefixes, and the tuple concatenation at line 210 properly creates a new tuple without mutating the defaults.

289-306: LGTM!

The error handling improvement properly addresses the PR objective of logging OSError from killpg instead of suppressing it. The structured logging at WARNING level with pid and error context follows the coding guidelines. The fallback to proc.kill() with ProcessLookupError suppression is appropriate—other OSError types would propagate, which is explicit error handling rather than silent swallowing.

308-316: LGTM!

Good abstraction that delegates to the shared close_subprocess_transport utility. This eliminates duplication and centralizes the CPython-internal _transport access pattern. The docstring properly references the target function for implementation details, and the cleanup semantics (log and suppress exceptions) are appropriate for operations that shouldn't mask primary results.

469-481: LGTM!

The try/finally pattern properly ensures subprocess transport cleanup regardless of success, timeout, or exceptions. This addresses the Windows ResourceWarning issue mentioned in the PR objectives. The _close_process call is idempotent since close_subprocess_transport checks is_closing() before attempting to close.

src/ai_company/engine/shutdown.py

src/ai_company/tools/_process_cleanup.py

src/ai_company/tools/file_system/list_directory.py

Copilot

Pull request overview

This PR hardens subprocess-based tooling and shutdown behavior by extracting shared cleanup utilities, tightening git timeout error reporting, and adding validation/config knobs for sandbox PATH filtering, alongside related tests and documentation updates.

Changes:

Add shared asyncio subprocess transport cleanup helper and apply it in git + sandbox subprocess paths.
Improve git timeout error messages by including a sanitized stderr fragment; add config validation/tests for sandbox extra_safe_path_prefixes.
Improve shutdown cleanup robustness and observability (post-cancel exception logging, new cleanup-failure event, Windows stderr fallback).

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
tests/unit/tools/sandbox/test_subprocess_sandbox.py	Adds unit tests for `extra_safe_path_prefixes` behavior and validation.
tests/unit/tools/git/test_git_tools.py	Adds tests ensuring git timeout errors include/sanitize stderr fragments.
tests/unit/engine/test_agent_engine_errors.py	Adjusts fatal-error chaining assertion to validate the secondary failure cause.
src/ai_company/tools/sandbox/subprocess_sandbox.py	Uses shared transport cleanup and logs killpg fallback failures via a new sandbox event.
src/ai_company/tools/sandbox/config.py	Introduces `extra_safe_path_prefixes` with Pydantic validation for non-empty absolute paths.
src/ai_company/tools/git_tools.py	Renames/refactors ref/arg flag-injection validation calls; fixes clone URL error spacing.
src/ai_company/tools/file_system/list_directory.py	Updates comments clarifying `total` semantics and truncation behavior.
src/ai_company/tools/_process_cleanup.py	New shared helper to close subprocess transports safely without masking primary errors.
src/ai_company/tools/_git_base.py	Adds stderr sanitization + inclusion on timeout; ensures transport cleanup in direct subprocess path.
src/ai_company/observability/events/tool.py	Adds tool event constant for subprocess transport close failures.
src/ai_company/observability/events/sandbox.py	Adds sandbox event constant for kill fallback logging.
src/ai_company/observability/events/execution.py	Adds execution shutdown cleanup-failure event constant.
src/ai_company/engine/shutdown.py	Refactors post-cancel exception retrieval/logging; improves cleanup failure logging and Windows signal fallback visibility.
README.md	Documents runtime/system requirements (Python 3.14+, uv, Git).
DESIGN_SPEC.md	Updates git hardening and sandbox feature descriptions for current behavior.
CLAUDE.md	Updates contributor workflow guidance (“fix everything valid — never skip”).
.claude/skills/pre-pr-review/SKILL.md	Aligns skill instructions with “implement all valid findings” default behavior.
.claude/skills/aurelio-review-pr/SKILL.md	Aligns skill instructions with “implement all valid findings” default behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-07T13:16:49Z

src/ai_company/tools/_git_base.py

+def _sanitize_stderr(raw: str) -> str:
+    """Strip control characters and truncate stderr for safe inclusion."""
+    return _CONTROL_CHAR_RE.sub("", raw)[:_MAX_STDERR_FRAGMENT]


_sanitize_stderr() claims to "strip control characters", but the regex leaves common control chars like \n, \r, and \t. Because the sanitized fragment is included both in the tool-facing timeout message and in the GIT_COMMAND_TIMEOUT log event, this can still allow multi-line/log-injection style output. Consider either expanding the regex (or normalizing whitespace) to remove/replace \r/\n/\t, or tighten the docstring to match the actual sanitization guarantees.

Copilot · 2026-03-07T13:16:50Z

src/ai_company/tools/file_system/list_directory.py

+        # total reflects classified entries (not raw filesystem entries).
+        # _list_sync's islice caps scanning at MAX_ENTRIES + 1, but
+        # _classify_entry may drop some entries (e.g. non-symlinks outside
+        # workspace), so len(lines) can be less than the islice cap.
+        # When truncated, the true directory size is unknown.
        total = len(lines)
        truncated = total > MAX_ENTRIES
        if truncated:


The new comment notes _classify_entry may drop entries after _list_sync caps scanning at MAX_ENTRIES + 1, but truncation is currently computed from len(lines) > MAX_ENTRIES. If many entries are dropped, the scan can still hit the cap while len(lines) stays <= MAX_ENTRIES, causing truncated=False and total_entries to be misleading even though the listing may be incomplete. Consider tracking truncation based on the raw entries length (or returning a separate truncated_raw flag / sentinel) rather than the post-classification lines length.

Copilot · 2026-03-07T13:16:50Z

tests/unit/tools/sandbox/test_subprocess_sandbox.py

+    def test_rejects_empty_string(self) -> None:
+        with pytest.raises(ValueError, match="non-empty absolute"):
+            SubprocessSandboxConfig(extra_safe_path_prefixes=("",))
+
+    def test_rejects_relative_path(self) -> None:
+        with pytest.raises(ValueError, match="non-empty absolute"):
+            SubprocessSandboxConfig(
+                extra_safe_path_prefixes=("relative/path",),
+            )


These config-validation tests use pytest.raises(ValueError), which will also match pydantic.ValidationError and makes it less clear that this is Pydantic boundary validation (and consistent with tests/unit/tools/sandbox/test_config.py). Consider asserting ValidationError explicitly (and importing it) so failures don’t accidentally pass due to unrelated ValueErrors.

Copilot · 2026-03-07T13:16:50Z

src/ai_company/engine/shutdown.py

+                    logger.debug(
+                        EXECUTION_SHUTDOWN_TASK_ERROR,
+                        error=(f"Post-cancel task exception: {type(exc).__name__}"),


_log_post_cancel_exceptions() currently logs only the exception type name. For post-cancel failures this is often not actionable without the exception message and some task identity (e.g., task.get_name() / id(task)). Consider logging str(exc) (or repr(exc)) and a task identifier to make these DEBUG logs useful for diagnosing shutdown issues.

Suggested change

logger.debug(

EXECUTION_SHUTDOWN_TASK_ERROR,

error=(f"Post-cancel task exception: {type(exc).__name__}"),

task_name = getattr(task, "get_name", None)

task_name = task_name() if callable(task_name) else None

logger.debug(

EXECUTION_SHUTDOWN_TASK_ERROR,

error=(

f"Post-cancel task exception: "

f"{type(exc).__name__}: {exc}"

),

task_name=task_name,

task_id=id(task),

error_message=str(exc),

greptile-apps · 2026-03-07T13:17:02Z

Greptile Summary

This PR is a well-scoped hardening pass across the subprocess, sandbox, and shutdown layers. It eliminates duplication via the new _process_cleanup.py shared utility, adds defense-in-depth against log/prompt injection via stderr sanitization, and improves shutdown resilience on both Windows and Unix. All changes are covered by new tests and pass the full 2457-test suite at 96% coverage.

Key changes:

_process_cleanup.py: New shared close_subprocess_transport() helper eliminates duplication between _git_base.py and subprocess_sandbox.py; uses getattr to safely access CPython-internal _transport, with guarded exception handling so cleanup never masks primary results.
_git_base.py: _sanitize_stderr() strips control characters (preventing log/LLM prompt injection), redacts embedded git credential URLs, and truncates to 500 chars before including stderr in timeout error messages. Credential redaction is applied before truncation to avoid partial credential leaks.
subprocess_sandbox.py: killpg OSError (e.g. EPERM) is now logged at WARNING via SANDBOX_KILL_FALLBACK before falling back to proc.kill(), ending a previous silent suppression.
shutdown.py: _log_post_cancel_exceptions() is extracted for clarity, handles asyncio.InvalidStateError explicitly at DEBUG, and the Windows _handle_signal_threadsafe path gains a sys.stderr.write last-resort fallback when no event loop is available.
config.py: A @field_validator for extra_safe_path_prefixes rejects empty strings and relative paths at the Pydantic boundary using platform-aware PurePath.is_absolute().
list_directory.py: Comment accuracy fix — total correctly reflects displayable classified lines, not the raw scan count.

Confidence Score: 5/5

This PR is safe to merge — all changes are defence-in-depth hardening with full test coverage, no regressions, and no behaviour-breaking modifications to existing APIs.
Every change is a targeted improvement (deduplication, sanitization, validation, logging, fallback handling) with corresponding tests. The refactoring is clean, the exception handling is intentional and documented, and credential redaction is applied before truncation in the correct order. No critical logic issues were identified.
No files require special attention.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph GitTool["_BaseGitTool._run_git_direct()"]
        A[Start subprocess] --> B{OSError?}
        B -- yes --> ERR1[ToolExecutionResult is_error=True]
        B -- no --> C[wait_for communicate timeout=deadline]
        C --> D{TimeoutError?}
        D -- no --> G[_process_git_output]
        D -- yes --> E[proc.kill]
        E --> F["wait_for communicate timeout=5s\n+ _sanitize_stderr()"]
        F --> ERR2["ToolExecutionResult is_error=True\n(with sanitized stderr fragment)"]
        C --> FINALLY[finally: close_subprocess_transport]
        F --> FINALLY
        G --> FINALLY
    end

    subgraph Sandbox["SubprocessSandbox.execute()"]
        S1[_validate_cwd] --> S2[_build_filtered_env\nallowlist + denylist + PATH filter]
        S2 --> S3[_spawn_process\nstart_new_session=True on Unix]
        S3 --> S4[_communicate_with_timeout]
        S4 --> S5{TimeoutError?}
        S5 -- yes --> S6["_kill_process\nkillpg on Unix / proc.kill fallback"]
        S6 --> S7[_drain_after_kill timeout=5s]
        S7 --> SFIN[finally: close_subprocess_transport]
        S5 -- no --> SFIN
    end

    subgraph Shutdown["ShutdownManager signal handling"]
        SH1{Platform?}
        SH1 -- Unix --> SH2["loop.add_signal_handler\n→ _handle_signal"]
        SH1 -- Windows --> SH3["signal.signal\n→ _handle_signal_threadsafe"]
        SH3 --> SH4{Running loop?}
        SH4 -- yes --> SH5["call_soon_threadsafe\n→ request_shutdown"]
        SH4 -- no / RuntimeError --> SH6["request_shutdown() directly"]
        SH6 --> SH7{Raised?}
        SH7 -- yes --> SH8["sys.stderr.write fallback"]
        SH8 --> SH9{stderr failed?}
        SH9 -- yes --> SH10["silent pass (noqa S110)"]
    end

_{Last reviewed commit: 940068d}

@staticmethod

…, and Copilot Source fixes: - Move asyncio/types out of TYPE_CHECKING for PEP 649 compatibility (#1, #2) - Guard is_closing() inside try/except in _process_cleanup.py (#4) - Normalize all control chars (incl. newlines/tabs) in _sanitize_stderr (#6) - Apply stderr sanitization to sandbox git path too (#3) - Fix list_directory truncation to use raw scan cap, not post-classification count (#7) - Narrow except to InvalidStateError + log task identity in shutdown (#8) - Add loop.stop() fallback when request_shutdown() fails in signal handlers (#9) - Include zombie diagnostic in _drain_after_kill stderr output (#10) - Remove @staticmethod from _log_post_cancel_exceptions (#17) - Add _process_cleanup.py to DESIGN_SPEC §15.3 (#12) and §11.1.1 (#16) Test additions: - New test_process_cleanup.py: 7 tests covering all transport states (#5) - Add _sanitize_stderr truncation test (#14) - Add _log_post_cancel_exceptions tests (4 tests) (#11) - Add signal handler recovery tests (3 tests) (#15) - Use ValidationError instead of ValueError in config tests (#13) - Update existing tests for new sanitization behavior Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/tools/_git_base.py`:
- Around line 84-91: The _sanitize_stderr function currently only strips control
characters which can still leave credentials like https://user:token@host/... in
logs; update _sanitize_stderr to additionally redact embedded credentials by
applying a regex to replace any user:password@ (or user@) credential portion in
URLs with a safe placeholder (e.g., [REDACTED] or remove credentials) before
collapsing control characters and truncating; reference the existing
_sanitize_stderr function, reuse/_extend the _CONTROL_CHAR_RE step and
_MAX_STDERR_FRAGMENT truncation, and add a URL credential regex pass to ensure
stderr and ToolExecutionResult.content cannot leak https://user:pass@ or similar
patterns.

In `@src/ai_company/tools/sandbox/subprocess_sandbox.py`:
- Around line 193-210: The current _get_safe_path_prefixes appends
SubprocessSandboxConfig.extra_safe_path_prefixes without validating them,
allowing repo-controlled directories to shadow system binaries; update
_get_safe_path_prefixes to only include extras that are absolute, exist, and are
vetted as trusted system locations (e.g., reside under known system roots or
match a maintained whitelist), and reject any paths that are writable by
non-root users or that are inside the workspace or other user-writable
directories; also ensure _spawn_process’s command resolution relies only on the
validated prefixes returned by _get_safe_path_prefixes so untrusted extras are
never used.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7cbe6a4d-a113-4cd4-8617-4cc8a324543d

📥 Commits

Reviewing files that changed from the base of the PR and between 1e4d779 and 9a5a868.

📒 Files selected for processing (11)

DESIGN_SPEC.md
src/ai_company/engine/shutdown.py
src/ai_company/tools/_git_base.py
src/ai_company/tools/_process_cleanup.py
src/ai_company/tools/file_system/list_directory.py
src/ai_company/tools/sandbox/subprocess_sandbox.py
tests/unit/engine/test_shutdown.py
tests/unit/tools/file_system/test_list_directory.py
tests/unit/tools/git/test_git_tools.py
tests/unit/tools/sandbox/test_subprocess_sandbox.py
tests/unit/tools/test_process_cleanup.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Greptile Review

🧰 Additional context used

📓 Path-based instructions (3)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Never use from __future__ import annotations — Python 3.14 has PEP 649 native lazy annotations
Use except A, B: syntax (without parentheses) as enforced by ruff for PEP 758 on Python 3.14
All public functions must have type hints and be compatible with mypy strict mode
All public classes and functions must have Google-style docstrings (enforced by ruff D rules)
Create new objects instead of mutating existing ones — implement immutability
For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction and wrap with MappingProxyType for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves
Never mix static config fields with mutable runtime fields in one Pydantic model
Use Pydantic v2 with BaseModel, model_validator, computed_field, and ConfigDict
Use @computed_field for derived values instead of storing and validating redundant fields (e.g., TokenUsage.total_tokens)
Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g., multiple tool invocations, parallel agent calls) — use structured concurrency over bare create_task
Maintain line length of 88 characters (enforced by ruff)
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly — never silently swallow exceptions
Validate at system boundaries (user input, external APIs, config files)

Files:

tests/unit/tools/sandbox/test_subprocess_sandbox.py
tests/unit/tools/test_process_cleanup.py
tests/unit/tools/git/test_git_tools.py
src/ai_company/tools/_process_cleanup.py
src/ai_company/tools/file_system/list_directory.py
src/ai_company/engine/shutdown.py
tests/unit/engine/test_shutdown.py
src/ai_company/tools/sandbox/subprocess_sandbox.py
tests/unit/tools/file_system/test_list_directory.py
src/ai_company/tools/_git_base.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Mark unit tests with @pytest.mark.unit
Mark integration tests with @pytest.mark.integration
Mark end-to-end tests with @pytest.mark.e2e
Mark slow tests with @pytest.mark.slow
Maintain 80% minimum code coverage (enforced in CI)
All tests must complete within 30 seconds timeout
Use pytest-xdist via -n auto for parallel test execution
Prefer @pytest.mark.parametrize for testing similar cases
Use test-provider, test-small-001 (not vendor-specific names) in tests

Files:

tests/unit/tools/sandbox/test_subprocess_sandbox.py
tests/unit/tools/test_process_cleanup.py
tests/unit/tools/git/test_git_tools.py
tests/unit/engine/test_shutdown.py
tests/unit/tools/file_system/test_list_directory.py

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Every module with business logic must import logging via from ai_company.observability import get_logger and create logger = get_logger(__name__)
Never use import logging, logging.getLogger(), or print() in application code
Always name the logger variable logger (not _logger, not log)
Always use event name constants from ai_company.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider) — import directly from the specific events module
Use structured logging with logger.info(EVENT, key=value) syntax — never use format strings like logger.info('msg %s', val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
Use DEBUG log level for object creation, internal flow, and entry/exit of key functions
All provider calls must go through BaseCompletionProvider which applies retry and rate limiting automatically
Never implement retry logic in driver subclasses or calling code — retry is handled by BaseCompletionProvider base class
Set RetryConfig and RateLimiterConfig per-provider in ProviderConfig
Retryable errors include RateLimitError, ProviderTimeoutError, ProviderConnectionError, and ProviderInternalError — all with is_retryable=True
Non-retryable errors must raise immediately without retry
RetryExhaustedError signals all retries have failed — engine layer catches this to trigger fallback chains
Rate limiter must respect RateLimitError.retry_after from providers and automatically pause future requests
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names: example-provider, example-large-001, example-medium-001, example-small-001, or large/medium/small as aliases

Files:

src/ai_company/tools/_process_cleanup.py
src/ai_company/tools/file_system/list_directory.py
src/ai_company/engine/shutdown.py
src/ai_company/tools/sandbox/subprocess_sandbox.py
src/ai_company/tools/_git_base.py

🧠 Learnings (4)

📚 Learning: 2026-03-07T12:24:27.098Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T12:24:27.098Z
Learning: Applies to **/*.py : Validate at system boundaries (user input, external APIs, config files)

Applied to files:

tests/unit/tools/sandbox/test_subprocess_sandbox.py

📚 Learning: 2026-03-07T12:24:27.098Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T12:24:27.098Z
Learning: Applies to tests/**/*.py : All tests must complete within 30 seconds timeout

Applied to files:

tests/unit/tools/git/test_git_tools.py

📚 Learning: 2026-03-07T12:24:27.098Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T12:24:27.098Z
Learning: Update `DESIGN_SPEC.md` to reflect approved deviations when they occur

Applied to files:

DESIGN_SPEC.md

📚 Learning: 2026-03-07T12:24:27.098Z

Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-07T12:24:27.098Z
Learning: Applies to **/*.py : Handle errors explicitly — never silently swallow exceptions

Applied to files:

src/ai_company/engine/shutdown.py

🧬 Code graph analysis (7)

tests/unit/tools/sandbox/test_subprocess_sandbox.py (2)

src/ai_company/tools/sandbox/subprocess_sandbox.py (5)

config (108-110)

SubprocessSandbox (61-533)

workspace (113-115)

_get_safe_path_prefixes (193-210)

_build_filtered_env (212-255)

src/ai_company/tools/sandbox/config.py (1)

SubprocessSandboxConfig (8-83)

tests/unit/tools/test_process_cleanup.py (1)

src/ai_company/tools/_process_cleanup.py (1)

close_subprocess_transport (17-44)

src/ai_company/tools/_process_cleanup.py (1)

src/ai_company/observability/_logger.py (1)

get_logger (8-28)

src/ai_company/engine/shutdown.py (1)

src/ai_company/observability/_logger.py (1)

get_logger (8-28)

tests/unit/engine/test_shutdown.py (1)

src/ai_company/engine/shutdown.py (9)

strategy (352-354)

CooperativeTimeoutStrategy (117-325)

_log_post_cancel_exceptions (249-278)

ShutdownStrategy (82-114)

request_shutdown (85-87)

request_shutdown (142-144)

ShutdownManager (328-500)

_handle_signal (375-392)

_handle_signal_threadsafe (394-441)

src/ai_company/tools/sandbox/subprocess_sandbox.py (1)

src/ai_company/tools/_process_cleanup.py (1)

close_subprocess_transport (17-44)

src/ai_company/tools/_git_base.py (2)

src/ai_company/tools/_process_cleanup.py (1)

close_subprocess_transport (17-44)

src/ai_company/tools/base.py (1)

ToolExecutionResult (24-53)

🔇 Additional comments (15)

src/ai_company/tools/file_system/list_directory.py (3)

74-113: LGTM! Truncation semantics correctly fixed.

The new raw_capped return value properly separates raw scan truncation detection from the filtered line count. This addresses the prior concern where truncation was incorrectly inferred from len(lines) > MAX_ENTRIES, which could underreport when _classify_entry drops entries. Now:

raw_capped reflects whether the raw filesystem scan hit MAX_ENTRIES

total_entries correctly counts only visible/classified entries

The truncation message and metadata accurately reflect the raw scan limit

231-265: LGTM!

The _format_listing_result method correctly uses raw_capped for truncation signaling. The docstring accurately describes the parameter semantics.

293-321: LGTM!

The execute method correctly unpacks the new tuple return from _list_sync and threads raw_capped through to _format_listing_result.

tests/unit/tools/file_system/test_list_directory.py (1)

109-120: LGTM!

The assertion change from > MAX_ENTRIES to == MAX_ENTRIES is correct. The test creates plain text files that all pass _classify_entry, so exactly MAX_ENTRIES entries appear in the output when truncated.

tests/unit/tools/sandbox/test_subprocess_sandbox.py (1)

448-522: LGTM! Comprehensive test coverage for extra_safe_path_prefixes.

The tests thoroughly cover:

Integration with _get_safe_path_prefixes() and _build_filtered_env()

Platform-specific behavior (Windows vs Unix paths)

Boundary validation (empty string and relative path rejection via ValidationError)

Default behavior preservation

tests/unit/engine/test_shutdown.py (2)

420-459: LGTM! Good coverage for _log_post_cancel_exceptions.

The tests verify the helper correctly:

Skips cancelled tasks without calling exception()

Retrieves and logs exceptions from non-cancelled tasks

Handles tasks with no exception gracefully

Catches InvalidStateError without propagating

464-504: LGTM! Signal handler recovery tests are comprehensive.

The tests verify resilience in edge cases:

Unix path falls back to loop.stop() when request_shutdown() raises

Windows no-loop path writes to stderr as last resort

Stderr write failures are suppressed (no exception raised)

tests/unit/tools/git/test_git_tools.py (1)

729-816: LGTM! Good test coverage for stderr sanitization on timeout.

The tests thoroughly verify:

Stderr fragment inclusion in timeout messages

Control character sanitization (null, bell, unit separator removed)

Truncation to 500 characters

The mocking pattern correctly simulates timeout scenarios by having the first communicate() call block indefinitely while the second returns the stderr payload.

tests/unit/tools/test_process_cleanup.py (1)

12-70: LGTM! Comprehensive test coverage for close_subprocess_transport.

The tests cover all edge cases including:

No-op paths (None/missing _transport, already closing)

Normal operation (transport closed when open)

Exception suppression (from both is_closing() and close())

Resilience to non-transport objects in _transport

src/ai_company/tools/_process_cleanup.py (1)

17-44: LGTM! Past review concern addressed.

The is_closing() call is now correctly guarded inside the try/except block (line 37), addressing the prior concern about unguarded exceptions. The implementation:

Safely accesses _transport via getattr

Returns early if transport is None

Guards both is_closing() and close() calls with exception handling

Logs failures at DEBUG with exc_info=True for observability

src/ai_company/engine/shutdown.py (4)

249-278: LGTM! Post-cancel exception logging implemented correctly.

The _log_post_cancel_exceptions helper:

Retrieves exceptions to prevent asyncio's "Task exception was never retrieved" warning

Logs non-cancelled tasks with exceptions at DEBUG level (appropriate for shutdown diagnostics)

Handles InvalidStateError gracefully with task name context

This addresses the past review concern about silently swallowing exceptions.

375-392: LGTM! Unix signal handler with resilient fallback.

The handler correctly:

Logs the signal reception

Attempts request_shutdown()

Falls back to loop.stop() with suppressed exceptions as a last resort to ensure the process can exit

The contextlib.suppress(Exception) on line 391 is appropriate here — it's the absolute last-resort fallback to prevent a stuck process.

394-441: LGTM! Windows signal handler with comprehensive fallback chain.

The handler implements a resilient fallback chain:

Try to schedule via call_soon_threadsafe (preferred)

If no loop, call request_shutdown() directly

If that fails, write to stderr (avoids structlog's lock acquisition)

If stderr fails, silently pass (last resort)

The except Exception: pass on line 440-441 is intentional and acceptable — at this point, all fallback mechanisms have failed, and the process is in an unrecoverable signal-handling state. The # noqa: S110 acknowledges this intentional suppression.

306-312: LGTM! Cleanup failure logging uses distinct event constant.

The switch to EXECUTION_SHUTDOWN_CLEANUP_FAILED for callback exceptions provides better observability by distinguishing cleanup failures from normal cleanup progress.

src/ai_company/tools/sandbox/subprocess_sandbox.py (1)

468-481: No review comment provided. Please provide the review comment within <review_comment> tags along with any relevant context (repository information, files being reviewed, etc.) to proceed with the verification and rewriting task.

src/ai_company/tools/_git_base.py

coderabbitai · 2026-03-07T13:43:44Z

src/ai_company/tools/sandbox/subprocess_sandbox.py

+    def _get_safe_path_prefixes(self) -> tuple[str, ...]:
+        """Return safe PATH prefixes for the current platform.
+
+        Combines built-in platform defaults with any extra prefixes
+        from ``SubprocessSandboxConfig.extra_safe_path_prefixes``.
+        """
+        defaults: tuple[str, ...]
        if os.name == "nt":
            system_root = os.environ.get("SYSTEMROOT", r"C:\WINDOWS")
-            return (
+            defaults = (
                system_root,
                str(Path(system_root) / "system32"),
                r"C:\Program Files\Git",
                r"C:\Program Files (x86)\Git",
            )
-        return ("/usr/bin", "/usr/local/bin", "/bin", "/usr/sbin", "/sbin")
+        else:
+            defaults = ("/usr/bin", "/usr/local/bin", "/bin", "/usr/sbin", "/sbin")
+        return defaults + self._config.extra_safe_path_prefixes


⚠️ Potential issue | 🟠 Major

Reject repo-controlled PATH prefixes.

Appending arbitrary absolute extra_safe_path_prefixes weakens the “system binaries only” guarantee. Because _spawn_process() still resolves bare commands through PATH, a prefix under the workspace or another writable directory lets a repo shadow git or python with its own executable. Validate these extras as trusted system locations, not just “absolute”.

As per coding guidelines, "Validate at system boundaries (user input, external APIs, config files)".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/tools/sandbox/subprocess_sandbox.py` around lines 193 - 210, The current _get_safe_path_prefixes appends SubprocessSandboxConfig.extra_safe_path_prefixes without validating them, allowing repo-controlled directories to shadow system binaries; update _get_safe_path_prefixes to only include extras that are absolute, exist, and are vetted as trusted system locations (e.g., reside under known system roots or match a maintained whitelist), and reject any paths that are writable by non-root users or that are inside the workspace or other user-writable directories; also ensure _spawn_process’s command resolution relies only on the validated prefixes returned by _get_safe_path_prefixes so untrusted extras are never used.

…results Apply _CREDENTIAL_RE to stderr fragments to prevent leaking https://user:token@host URLs in timeout error messages and logs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-07T13:51:21Z

tests/unit/tools/test_process_cleanup.py

+        proc = MagicMock()
+        # _transport is a string (not a transport object)
+        type(proc)._transport = PropertyMock(return_value="not-a-transport")
+        # Should not raise
+        close_subprocess_transport(proc)


type(proc)._transport = PropertyMock(...) mutates the global unittest.mock.MagicMock class (since type(proc) is MagicMock), which can leak into unrelated tests and even make later assignments like mock._transport = None raise AttributeError. Use a dedicated dummy class/instance with a property, or patch the attribute via patch.object(..., new_callable=PropertyMock) in a context manager so it is restored after the test.

🤖 I have created a release *beep* *boop* --- ## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08)) * add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545)) * add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c)) * add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b)) * design unified provider interface ([#86](#86)) ([3e23d64](3e23d64)) * expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134)) * implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5)) * implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a)) * implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b)) * implement built-in file system tools ([#18](#18)) ([325ef98](325ef98)) * implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd)) * implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496)) * implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2)) * implement core entity and role system models ([#69](#69)) ([acf9801](acf9801)) * implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed)) * implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f)) * implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5)) * implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515)) * implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6)) * implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4) * implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe)) * implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca)) * implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2)) * implement model routing engine ([#99](#99)) ([d3c250b](d3c250b)) * implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3)) * implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c)) * implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85)) * implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4)) * implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e)) * implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30) * implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52)) * implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1)) * implement tool permission checking ([#16](#16)) ([833c190](833c190)) * implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba)) * implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba)) * initialize project with uv, hatchling, and src layout ([39005f9](39005f9)) * initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9)) * Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08)) * make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109) * parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee)) * testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749)) * wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617)) ### Bug Fixes * address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929)) * address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169) * enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c)) * harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e)) * harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325)) * harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb)) * incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a)) * pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108)) * strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861)) ### Performance * harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188) ### Refactoring * adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90)) * extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b)) * harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9)) * harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299)) * pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0)) * split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89)) ### Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

🤖 I have created a release *beep* *boop* --- ## [0.1.0](v0.0.0...v0.1.0) (2026-03-11) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add mandatory JWT + API key authentication ([#256](#256)) ([c279cfe](c279cfe)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable output scan response policies ([#263](#263)) ([b9907e8](b9907e8)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08)) * add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545)) * add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c)) * add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b)) * design unified provider interface ([#86](#86)) ([3e23d64](3e23d64)) * expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134)) * implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5)) * implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a)) * implement AuditRepository for security audit log persistence ([#279](#279)) ([94bc29f](94bc29f)) * implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b)) * implement built-in file system tools ([#18](#18)) ([325ef98](325ef98)) * implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd)) * implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496)) * implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2)) * implement core entity and role system models ([#69](#69)) ([acf9801](acf9801)) * implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed)) * implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f)) * implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5)) * implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515)) * implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6)) * implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4) * implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe)) * implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca)) * implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2)) * implement model routing engine ([#99](#99)) ([d3c250b](d3c250b)) * implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3)) * implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c)) * implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85)) * implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4)) * implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e)) * implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30) * implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52)) * implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1)) * implement tool permission checking ([#16](#16)) ([833c190](833c190)) * implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba)) * implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba)) * initialize project with uv, hatchling, and src layout ([39005f9](39005f9)) * initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9)) * Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08)) * make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109) * parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee)) * testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749)) * wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617)) ### Bug Fixes * address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929)) * address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169) * enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c)) * harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e)) * harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325)) * harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb)) * incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a)) * pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108)) * resolve circular imports, bump litellm, fix release tag format ([#286](#286)) ([a6659b5](a6659b5)) * strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861)) ### Performance * harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188) ### Refactoring * adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90)) * extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b)) * harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9)) * harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299)) * pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0)) * split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89)) ### Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * bump anchore/scan-action from 6.5.1 to 7.3.2 ([#271](#271)) ([80a1c15](80a1c15)) * bump docker/build-push-action from 6.19.2 to 7.0.0 ([#273](#273)) ([dd0219e](dd0219e)) * bump docker/login-action from 3.7.0 to 4.0.0 ([#272](#272)) ([33d6238](33d6238)) * bump docker/metadata-action from 5.10.0 to 6.0.0 ([#270](#270)) ([baee04e](baee04e)) * bump docker/setup-buildx-action from 3.12.0 to 4.0.0 ([#274](#274)) ([5fc06f7](5fc06f7)) * bump sigstore/cosign-installer from 3.9.1 to 4.1.0 ([#275](#275)) ([29dd16c](29dd16c)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * **main:** release ai-company 0.1.1 ([#282](#282)) ([2f4703d](2f4703d)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>

Aureliolo and others added 3 commits March 7, 2026 13:41

docs: add system requirements section to README

ca9bc6c

Document Python 3.14+, uv, and Git 2.x+ as system-level prerequisites. Git is a runtime dependency for the built-in git tools (subprocess-based, not managed by pip/uv). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings March 7, 2026 13:10

Copilot started reviewing on behalf of Aureliolo March 7, 2026 13:10 View session

gemini-code-assist bot reviewed Mar 7, 2026

View reviewed changes

coderabbitai bot reviewed Mar 7, 2026

View reviewed changes

src/ai_company/engine/shutdown.py Outdated Show resolved Hide resolved

src/ai_company/tools/_process_cleanup.py Show resolved Hide resolved

src/ai_company/tools/file_system/list_directory.py Outdated Show resolved Hide resolved

Copilot AI reviewed Mar 7, 2026

View reviewed changes

coderabbitai bot reviewed Mar 7, 2026

View reviewed changes

fix: redact credentials in _sanitize_stderr before including in tool …

940068d

…results Apply _CREDENTIAL_RE to stderr fragments to prevent leaking https://user:token@host URLs in timeout error messages and logs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings March 7, 2026 13:45

Aureliolo merged commit d1fe1fb into main Mar 7, 2026
10 of 11 checks passed

Aureliolo deleted the fix/post-merge-ci-and-review-feedback branch March 7, 2026 13:46

Copilot started reviewing on behalf of Aureliolo March 7, 2026 13:46 View session

Copilot AI reviewed Mar 7, 2026

View reviewed changes

Aureliolo mentioned this pull request Mar 10, 2026

chore(main): release ai-company 0.1.1 #282

Merged

Aureliolo mentioned this pull request Mar 10, 2026

chore(main): release 0.1.0 #283

Merged

This was referenced Mar 15, 2026

chore(main): release 0.2.4 #431

Merged

chore(main): release 0.2.0 #442

Closed

chore(main): release 0.2.5 #447

Merged

chore(main): release 0.2.0 #460

Closed

chore(main): release 0.2.0 #471

Closed

-                    logger.debug(
-                        EXECUTION_SHUTDOWN_TASK_ERROR,
-                        error=(f"Post-cancel task exception: {type(exc).__name__}"),
+                    task_name = getattr(task, "get_name", None)
+                    task_name = task_name() if callable(task_name) else None
+                    logger.debug(
+                        EXECUTION_SHUTDOWN_TASK_ERROR,
+                        error=(
+                            f"Post-cancel task exception: "
+                            f"{type(exc).__name__}: {exc}"
+                        ),
+                        task_name=task_name,
+                        task_id=id(task),
+                        error_message=str(exc),

Conversation

Aureliolo commented Mar 7, 2026

Summary

Test plan

Review coverage

Uh oh!

github-actions bot commented Mar 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Dependency Review

Scanned Files

Uh oh!

coderabbitai bot commented Mar 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Summary by CodeRabbit

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

❌ Failed checks (1 warning)

Uh oh!

gemini-code-assist bot commented Mar 7, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot commented Mar 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Flowchart

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

github-actions bot commented Mar 7, 2026 •

edited

Loading

coderabbitai bot commented Mar 7, 2026 •

edited

Loading

greptile-apps bot commented Mar 7, 2026 •

edited

Loading