Conversation
… tests, hookify rules (#148)

- Phase 1: Sync ruff rev to v0.15.5, add NotBlankStr to template schema fields, add parametrize convention to CLAUDE.md.
- Phase 2: Add missing loggers to config/utils.py, templates/presets.py, and observability/correlation.py. Create events/correlation.py module.
- Phase 3: Fix exception causality in agent_engine (raise from None), add GitBranchTool unknown-action guard, add ref2-without-ref1 validation in GitDiffTool, add git credential sanitization in log output, fix _UNSAFE_GLOB_RE trailing .. regex, block ** glob without recursive flag.
- Phase 4: Refactor 32 long functions (>50 lines) by extracting private helpers. No behavioral changes — same class/module, same public API.
- Phase 5: Add 3 new test files (test_errors, test_base_fs_tool, test_prompt_template), expand 9 existing test files with 46+ new tests, parametrize 4 test files to reduce duplication. Coverage: 96.52%.
- Phase 6: Add 3 hookify prevention rules (PEP 758 except syntax, missing logger detection, function length reminder).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pre-reviewed by 10 agents; 13 findings addressed:

- Remove redundant isinstance check in react_loop._process_turn_response
- Add MemoryError/RecursionError to Raises docs (run, execute, invoke_all)
- Add `as exc` to personality preset KeyError catch in renderer
- Add _check_ref guard on --author/--since/--until in GitLogTool
- Extract _build_filter_args helper to reduce complexity
- Fix _handle_retryable_error docstring to clarify return semantics
- Change TemplateAgentConfig.model from str to NotBlankStr
- Add events/correlation.py to DESIGN_SPEC.md §15.3
- Update crash recovery status to "Adopted" in DESIGN_SPEC.md §15.5
- Add `as exc` to built-in template except block in loader
- Restore discovery failure info in load_config Raises docstring
- Fix total_entries metadata to reflect pre-truncation count
- Add truncated flag to listing metadata
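The Phase 3 "raise from None" fix relies on Python's exception-chaining attributes. A minimal sketch of the behavior (the `parse_retries` function is hypothetical, not from the PR):

```python
# "raise ... from None" suppresses implicit exception chaining, so consumers
# (and the tightened test asserting __cause__ is None) see a clean error.
def parse_retries(raw: str) -> int:
    try:
        return int(raw)
    except ValueError:
        raise RuntimeError(f"invalid retry count: {raw!r}") from None

try:
    parse_retries("many")
except RuntimeError as exc:
    assert exc.__cause__ is None              # no explicit cause recorded
    assert exc.__suppress_context__ is True   # implicit context suppressed
```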
Caution: Review failed. The pull request is closed.

ℹ️ Recent review info — ⚙️ Run configuration: Organization UI; Review profile: ASSERTIVE; Plan: Pro; Run ID:

📒 Files selected for processing (19)
📝 Walkthrough — Summary by CodeRabbit
Walkthrough

Adds file-event hook configs; introduces new observability event constants and logging in several modules; refactors engine, prompt, react loop, provider, and tool code into smaller helpers; modularizes file-system tools; tightens template schema types; expands and parametrizes tests; bumps the ruff pre-commit pin.

Changes
Sequence Diagram(s)

sequenceDiagram
participant Client
participant AgentEngine
participant ReactLoop
participant PostPipeline
participant ErrorHandler
Client->>AgentEngine: run(identity, task, ctx)
AgentEngine->>ReactLoop: execute/react loop (turns, tool invocations)
ReactLoop-->>AgentEngine: ExecutionResult
alt success
AgentEngine->>PostPipeline: _post_execution_pipeline(result, identity, agent_id, task_id)
PostPipeline->>PostPipeline: record costs & apply transitions
PostPipeline->>AgentEngine: _build_and_log_result -> AgentRunResult
AgentEngine-->>Client: AgentRunResult
else fatal error
AgentEngine->>ErrorHandler: _handle_fatal_error(exc)
ErrorHandler->>ErrorHandler: _build_error_execution / _build_error_prompt
ErrorHandler-->>AgentEngine: error ExecutionResult
AgentEngine->>PostPipeline: _post_execution_pipeline(error_result, ...)
PostPipeline-->>AgentEngine: AgentRunResult (error)
AgentEngine-->>Client: AgentRunResult (error)
end
sequenceDiagram
participant User
participant TemplateLoader
participant Builtins
participant UserStore
participant Validator
User->>TemplateLoader: load_template(name)
TemplateLoader->>Builtins: _load_builtin(name)
alt builtin found
Builtins-->>TemplateLoader: Template
else not found
TemplateLoader->>UserStore: _collect_user_templates / _load_from_file
UserStore->>Validator: _validate_template_structure(data, source)
Validator-->>UserStore: validated dict
UserStore-->>TemplateLoader: Template
end
TemplateLoader-->>User: TemplateInfo
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~70 minutes

Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Dependency Review

✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.
Scanned files: none
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request delivers a comprehensive cleanup and enhancement of the codebase following a recent audit. The changes focus on improving code quality through extensive refactoring, bolstering security measures, and expanding test coverage to ensure robustness. It also introduces new development guardrails and refines logging practices for better system understanding.

Highlights
Changelog
Activity
Code Review
This pull request performs a wide-ranging cleanup and refactoring across the codebase: converting except syntax to follow PEP 758, adding missing loggers, fixing bugs related to unsafe glob patterns and git command handling, and refactoring long functions. The test suite has been substantially improved by consolidating repetitive tests with @pytest.mark.parametrize and adding new tests, and new hookify rules have been introduced. While the PR aims to enhance security through general credential sanitization and redaction of credentials in logs, a critical security issue was identified: in the git command logging sanitization logic, authentication tokens in URLs may still be leaked to logs if they do not follow a specific 'user:pass' format. This vulnerability requires immediate attention. Additionally, the adoption of Python 3.14 syntax for exception handling (PEP 758) is a deliberate design choice but may impact compatibility with older Python versions.
src/ai_company/tools/_git_base.py (outdated)

    _DEFAULT_TIMEOUT: Final[float] = 30.0

    _CREDENTIAL_RE = re.compile(r"(https?://)[^@/:]+:[^@/:]+@")
The regular expression used for redacting credentials in git command logs is too restrictive. It specifically looks for a colon between the username and password/token ([^@/:]+:[^@/:]+@). However, many git hosting services (like GitHub) allow using a token as the username without a password (e.g., https://<token>@github.com). In such cases, the regex will fail to match, and the full URL including the sensitive token will be logged in plain text. This poses a risk of credential leakage in application logs.
- _CREDENTIAL_RE = re.compile(r"(https?://)[^@/:]+:[^@/:]+@")
+ _CREDENTIAL_RE = re.compile(r"(https?://)[^@/]+@")
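The difference between the two patterns is easy to demonstrate. A minimal sketch (URL and token are made up) showing that the narrow regex misses token-only credentials while the broadened one redacts them:

```python
import re

narrow = re.compile(r"(https?://)[^@/:]+:[^@/:]+@")  # requires user:pass form
broad = re.compile(r"(https?://)[^@/]+@")            # any credential block before @

url = "https://ghp_exampletoken@github.com/org/repo.git"

# No colon before the @, so the narrow pattern leaves the token in the clear.
assert narrow.sub(r"\1***@", url) == url
# The broad pattern redacts anything between the scheme and the @.
assert broad.sub(r"\1***@", url) == "https://***@github.com/org/repo.git"
```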
Greptile Summary

Large-scale post-audit cleanup across 49 files touching the engine, tools, templates, providers, and observability layers. The changes are primarily structural (32+ long functions decomposed into focused helpers), with targeted bug fixes (unsafe glob regex, …).

Key changes and observations:
Confidence Score: 4/5
Important Files Changed
Sequence Diagram

sequenceDiagram
participant Caller
participant _run_git
participant _build_git_env
participant _start_git_process
participant _await_git_process
participant _process_git_output
Caller->>_run_git: args, cwd, deadline
_run_git->>_build_git_env: ()
_build_git_env-->>_run_git: hardened env dict
_run_git->>_start_git_process: args, work_dir, env
alt OSError
_start_git_process-->>_run_git: ToolExecutionResult(is_error=True)
_run_git-->>Caller: error result
else success
_start_git_process-->>_run_git: asyncio.Process
_run_git->>_await_git_process: proc, args, deadline
alt TimeoutError
_await_git_process-->>_run_git: ToolExecutionResult(is_error=True)
_run_git-->>Caller: error result
else success
_await_git_process-->>_run_git: (stdout_bytes, stderr_bytes)
_run_git->>_process_git_output: args, returncode, stdout, stderr
_process_git_output-->>_run_git: ToolExecutionResult
_run_git-->>Caller: result
end
end
Pull request overview
This is a comprehensive post-audit cleanup PR touching ~49 files with refactoring, bug fixes, security improvements, and test consolidation. The changes stem from a structured review by 10 specialized agents and systematically address code quality issues across the codebase.
Changes:
- Bug fixes & security: Fixes unsafe glob regex (trailing `..`), `ref2` without `ref1` validation in `GitDiffTool`, unknown branch action fallback, credential sanitization in git command logging, `**` without `recursive=True` guard, and flag-injection guards on `--author`/`--since`/`--until` in `GitLogTool`.
- Function decomposition: Long functions (32+) decomposed into focused helpers under 50 lines per CLAUDE.md convention across tools, engine, templates, providers, and config modules.
- Test improvements: Repetitive tests consolidated with `@pytest.mark.parametrize`, new edge-case coverage added (3 new test files), and test assertions tightened (e.g., `total_entries` metadata, `__cause__ is None`).
Reviewed changes
Copilot reviewed 49 out of 49 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/ai_company/tools/_git_base.py | Credential sanitization via _sanitize_command(), decomposed _run_git() into helpers |
| src/ai_company/tools/git_tools.py | Flag-injection guard, ref2/ref1 validation, unknown branch action fallback, schema extraction |
| src/ai_company/tools/file_system/list_directory.py | Fixed glob regex, added **/recursive guard, extracted _classify_entry, _format_listing_result |
| src/ai_company/tools/file_system/write_file.py | Decomposed execute() into _validate_write_args, _resolve_write_path, _perform_write |
| src/ai_company/tools/file_system/read_file.py | Decomposed execute() into _validate_read_args, _preflight_check_file, _perform_read |
| src/ai_company/tools/file_system/edit_file.py | Decomposed execute() into _validate_edit_args, _preflight_check_file, _perform_edit |
| src/ai_company/tools/file_system/delete_file.py | Extracted _handle_delete_error from execute |
| src/ai_company/tools/invoker.py | Extracted _safe_deepcopy_args, _raise_fatal_errors; docstring cleanup |
| src/ai_company/templates/schema.py | model, workflow, communication upgraded to NotBlankStr |
| src/ai_company/templates/renderer.py | Extracted _validate_list, _extract_numeric_config, _expand_single_agent |
| src/ai_company/templates/presets.py | Added logger and warning log in get_personality_preset |
| src/ai_company/templates/loader.py | Fixed PEP 758 except syntax bug, extracted _collect_user_templates, _validate_template_structure |
| src/ai_company/providers/registry.py | Extracted _resolve_factory from _build_driver |
| src/ai_company/providers/resilience/retry.py | Extracted _handle_retryable_error |
| src/ai_company/providers/base.py | Docstring trimming |
| src/ai_company/providers/routing/_strategy_helpers.py | Docstring trimming |
| src/ai_company/observability/events/correlation.py | New module with correlation event constants |
| src/ai_company/observability/events/config.py | Added CONFIG_CONVERSION_ERROR constant |
| src/ai_company/observability/correlation.py | Added logger and structured event logging for decorator misuse |
| src/ai_company/engine/agent_engine.py | Decomposed functions, raise exc from None fix, extracted _build_error_prompt |
| src/ai_company/engine/react_loop.py | Extracted _prepare_loop, _process_turn_response, _missing_invoker_error |
| src/ai_company/engine/prompt.py | Extracted _validate_max_tokens, _log_and_return, _build_prompt_result, _log_trim_results |
| src/ai_company/engine/task_execution.py | Extracted _build_transition_updates |
| src/ai_company/config/utils.py | Added logger and warning logs for conversion errors |
| src/ai_company/config/loader.py | Decomposed load_config into _load_and_merge_overrides, _finalize_config |
| src/ai_company/budget/tracker.py | Docstring trimming |
| DESIGN_SPEC.md | Added events/correlation.py to tree, crash recovery status → "Adopted" |
| CLAUDE.md | Added parametrize testing convention |
| .pre-commit-config.yaml | Ruff version bump v0.15.4 → v0.15.5 |
| .claude/hookify.*.md | Three new hookify prevention rules |
| tests/unit/tools/test_permissions.py | Consolidated 3 test classes into parametrized test |
| tests/unit/tools/git/test_git_tools.py | New tests for flag injection, ref2 without ref1, unknown action, credential sanitization |
| tests/unit/tools/file_system/test_list_directory.py | New tests for glob edge cases, ** guard, OSError, absolute patterns |
| tests/unit/tools/file_system/test_write_file.py | New tests for write error handling, temp file cleanup |
| tests/unit/tools/file_system/test_base_fs_tool.py | New file testing _map_os_error |
| tests/unit/templates/test_schema.py | Tests for blank model, workflow, communication rejection |
| tests/unit/templates/test_renderer.py | Tests for _parse_rendered_yaml, _build_departments, _validate_as_root_config, _collect_variables |
| tests/unit/templates/test_loader.py | Tests for user template errors, path traversal, _to_float |
| tests/unit/templates/test_errors.py | New file testing template error formatting |
| tests/unit/observability/test_sinks.py | Tests for file handler error paths |
| tests/unit/observability/test_events.py | Added correlation module to discovery test |
| tests/unit/engine/test_prompt_template.py | New file testing autonomy instructions and default template |
| tests/unit/engine/test_agent_engine_errors.py | Tightened assertion for __cause__ is None |
| tests/unit/engine/test_agent_engine.py | New test for task assigned to different agent |
| tests/unit/core/test_task_transitions.py | Consolidated tests with parametrize, added guard edge cases |
| tests/unit/core/test_task.py | Consolidated whitespace/assignment tests with parametrize |
| tests/unit/core/test_role_catalog.py | Added guard/invariant tests |
src/ai_company/templates/renderer.py (outdated)
    except KeyError as exc:
        logger.warning(
            TEMPLATE_PERSONALITY_PRESET_UNKNOWN,
            preset_name=preset_name,
            agent_name=name,
            available=sorted(PERSONALITY_PRESETS),
            error=str(exc),
        )
When get_personality_preset() is called with an unknown name, the warning TEMPLATE_PERSONALITY_PRESET_UNKNOWN is now logged twice: once inside get_personality_preset() at presets.py:90-94 (newly added), and again in the except KeyError handler here in the renderer. This produces duplicate log entries for the same event. Either the logging in presets.py should be removed (since the caller already logs it with more context like agent_name), or this logging here should be removed since presets.py now handles it.
- except KeyError as exc:
-     logger.warning(
-         TEMPLATE_PERSONALITY_PRESET_UNKNOWN,
-         preset_name=preset_name,
-         agent_name=name,
-         available=sorted(PERSONALITY_PRESETS),
-         error=str(exc),
-     )
+ except KeyError:
+     # Unknown personality preset: logging is handled inside get_personality_preset().
+     # Swallow the error so the agent is created without a personality.
+     pass
Actionable comments posted: 21
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
tests/unit/tools/git/test_git_tools.py (1)
56-69: ⚠️ Potential issue | 🟡 Minor

Fix: pipeline failure due to outside_dir already existing.

The pipeline error indicates FileExistsError: [Errno 17] File exists: '/tmp/pytest-of-runner/pytest-0/popen-gw0/outside_dir'. This occurs because outside.mkdir() fails when running tests in parallel (pytest-xdist) and another test has already created this directory.

Use exist_ok=True or generate a unique directory name to avoid collisions in parallel test runs.

🛠️ Proposed fix
  async def test_symlink_escape_blocked(self, git_repo: Path) -> None:
-     outside = git_repo.parent / "outside_dir"
-     outside.mkdir()
+     outside = git_repo.parent / f"outside_dir_{id(self)}"
+     outside.mkdir(exist_ok=True)
      (outside / "secret.txt").write_text("secret")

Alternatively, use tmp_path_factory with a unique name or tempfile.mkdtemp() for guaranteed isolation.

🤖 Prompt for AI Agents
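The tempfile.mkdtemp() route can be sketched as follows (the prefix is illustrative): each call returns a freshly created, guaranteed-unique directory, so parallel pytest-xdist workers cannot race on a shared "outside_dir" name.

```python
import tempfile
from pathlib import Path

# mkdtemp() creates the directory atomically with a unique suffix,
# so no two workers can collide on the same path
outside = Path(tempfile.mkdtemp(prefix="outside_dir_"))
(outside / "secret.txt").write_text("secret")

assert outside.is_dir()
assert (outside / "secret.txt").read_text() == "secret"
```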
Verify each finding against the current code and only fix it if needed. In `@tests/unit/tools/git/test_git_tools.py` around lines 56-69: the test test_symlink_escape_blocked creates outside = git_repo.parent / "outside_dir" and calls outside.mkdir(), which can raise FileExistsError in parallel runs; change the mkdir call to use exist_ok=True (i.e., outside.mkdir(exist_ok=True)) or generate a unique temp dir (e.g., via tmp_path_factory or tempfile.mkdtemp()) so the setup for outside_dir is robust under pytest-xdist, leaving the rest of the test (link.symlink_to and invoking GitDiffTool.execute) unchanged.

src/ai_company/tools/file_system/edit_file.py (1)
193-249: 🧹 Nitpick | 🔵 Trivial

Consider splitting _perform_edit to meet the 50-line guideline.

This method spans 57 lines, exceeding the 50-line limit. While the logic is cohesive and readable, you could extract the result-construction (lines 218-249) into a small helper like _build_edit_result(user_path, count, old_text) to bring this under the threshold.

♻️ Suggested refactor to extract result building
+ def _build_edit_result(
+     self,
+     user_path: str,
+     count: int,
+     old_text: str,
+ ) -> ToolExecutionResult:
+     """Build the result after a successful edit operation."""
+     if count == 0:
+         logger.warning(
+             TOOL_FS_EDIT_NOT_FOUND,
+             path=user_path,
+             old_text_len=len(old_text),
+         )
+         return ToolExecutionResult(
+             content=f"Text not found in {user_path}.",
+             is_error=True,
+             metadata={
+                 "path": user_path,
+                 "occurrences_found": 0,
+                 "occurrences_replaced": 0,
+             },
+         )
+     msg = f"Replaced 1 occurrence in {user_path}"
+     if count > 1:
+         msg += f" (warning: {count} total occurrences found, only first replaced)"
+     logger.info(
+         TOOL_FS_EDIT,
+         path=user_path,
+         occurrences_found=count,
+         occurrences_replaced=1,
+     )
+     return ToolExecutionResult(
+         content=msg,
+         metadata={
+             "path": user_path,
+             "occurrences_found": count,
+             "occurrences_replaced": 1,
+         },
+     )
+
  async def _perform_edit(
      self,
      user_path: str,
      resolved: Path,
      old_text: str,
      new_text: str,
  ) -> ToolExecutionResult:
      """Run the edit and return the result."""
      try:
          count = await asyncio.to_thread(
              _edit_sync,
              resolved,
              old_text,
              new_text,
          )
      except UnicodeDecodeError:
          logger.warning(TOOL_FS_ERROR, path=user_path, error="binary")
          return ToolExecutionResult(
              content=f"Cannot edit binary file: {user_path}",
              is_error=True,
          )
      except (FileNotFoundError, IsADirectoryError, PermissionError, OSError) as exc:
          log_key, msg = _map_os_error(exc, user_path, "editing")
          logger.warning(TOOL_FS_ERROR, path=user_path, error=log_key)
          return ToolExecutionResult(content=msg, is_error=True)
-     if count == 0:
-         logger.warning(
-             TOOL_FS_EDIT_NOT_FOUND,
-             path=user_path,
-             old_text_len=len(old_text),
-         )
-         return ToolExecutionResult(
-             content=f"Text not found in {user_path}.",
-             is_error=True,
-             metadata={
-                 "path": user_path,
-                 "occurrences_found": 0,
-                 "occurrences_replaced": 0,
-             },
-         )
-     msg = f"Replaced 1 occurrence in {user_path}"
-     if count > 1:
-         msg += f" (warning: {count} total occurrences found, only first replaced)"
-     logger.info(
-         TOOL_FS_EDIT,
-         path=user_path,
-         occurrences_found=count,
-         occurrences_replaced=1,
-     )
-     return ToolExecutionResult(
-         content=msg,
-         metadata={
-             "path": user_path,
-             "occurrences_found": count,
-             "occurrences_replaced": 1,
-         },
-     )
+     return self._build_edit_result(user_path, count, old_text)

As per coding guidelines: "Functions must be less than 50 lines".
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/tools/file_system/edit_file.py` around lines 193-249: the method _perform_edit exceeds the 50-line limit; extract the result-building and logging block that handles count==0 vs count>0 into a new helper (e.g. _build_edit_result(user_path: str, count: int, old_text: str) -> ToolExecutionResult) so _perform_edit only performs the edit and delegates construction of the ToolExecutionResult and related logger calls (TOOL_FS_EDIT_NOT_FOUND, TOOL_FS_EDIT) to that helper; ensure the helper returns the same content, metadata keys ("path", "occurrences_found", "occurrences_replaced") and logging behavior so callers of _perform_edit still receive identical results.

src/ai_company/tools/file_system/write_file.py (1)
119-140: ⚠️ Potential issue | 🟠 Major

Fix the size validation to account for text-mode newline expansion on Windows.

Line 125 counts UTF-8 bytes, but _write_sync() writes in text mode (line 55: os.fdopen(fd, "w", encoding="utf-8")). On Windows, text mode expands \n to \r\n, so content passing the preflight check can exceed MAX_WRITE_SIZE_BYTES when persisted, defeating the size limit. Use binary mode ("wb") or disable newline translation (newline="") to ensure the preflight check matches the actual file size.
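The newline="" option can be checked in isolation; a small sketch (file handling simplified relative to _write_sync):

```python
import os
import tempfile

content = "line1\nline2\n"
expected = len(content.encode("utf-8"))  # the preflight byte count

# newline="" disables newline translation, so the bytes on disk match the
# UTF-8 preflight count on every platform (plain "w" would write \r\n on Windows)
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w", encoding="utf-8", newline="") as fh:
    fh.write(content)

assert os.path.getsize(path) == expected
os.remove(path)
```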
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/tools/file_system/write_file.py` around lines 119-140: the size preflight in _validate_write_args measures UTF-8 bytes but _write_sync opens files in text mode which expands newlines on Windows; update _write_sync to open the file without newline translation (e.g., use os.fdopen(fd, "w", encoding="utf-8", newline="") or switch to binary mode "wb") so the actual bytes written will match the UTF-8 byte count used by _validate_write_args and keep MAX_WRITE_SIZE_BYTES enforcement consistent; verify references to _validate_write_args, _write_sync, and MAX_WRITE_SIZE_BYTES when making the change.

src/ai_company/templates/loader.py (1)
169-176: ⚠️ Potential issue | 🔴 Critical

Path traversal check fails to detect Windows-style separators on Unix.

The path traversal validation uses Path(name_clean).name, which relies on OS-specific path parsing. On Unix systems, backslashes (\) are valid filename characters, not path separators. So Path("..\etc\passwd").name returns "..\etc\passwd" unchanged, allowing the check to pass.

This explains the pipeline failure: the test passes '..\etc\passwd' expecting rejection, but the code proceeds to the "Unknown template" error at line 206 instead of the "Invalid template name" error at line 172.

🐛 Proposed fix to handle both Unix and Windows path separators
  # Sanitize to prevent path traversal.
- safe_name = Path(name_clean).name
- if safe_name != name_clean:
+ if "/" in name_clean or "\\" in name_clean or ".." in name_clean:
      msg = f"Invalid template name {name!r}: must not contain path separators"
      raise TemplateNotFoundError(
          msg,
          locations=(ConfigLocation(file_path=f"<template:{name}>"),),
      )
+ safe_name = name_clean
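pathlib's pure path flavors make the OS-dependence described above easy to see without a Windows machine:

```python
from pathlib import PurePosixPath, PureWindowsPath

# Backslash is a path separator only under Windows semantics.
name = "..\\etc\\passwd"
assert PurePosixPath(name).name == "..\\etc\\passwd"  # traversal slips through on Unix
assert PureWindowsPath(name).name == "passwd"         # separators recognized on Windows

# The separator-membership check in the proposed fix is platform-independent.
assert "\\" in name or "/" in name
```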
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/templates/loader.py` around lines 169 - 176, The current path-traversal check using Path(name_clean).name is OS-dependent and misses Windows separators; update the validation in loader.py to explicitly reject any template names containing either '/' or '\\' before further processing: check name_clean for these characters and raise the existing TemplateNotFoundError (with the same ConfigLocation(file_path=f"<template:{name}>")) when found, rather than relying solely on Path(name_clean).name/safe_name; keep the existing error message and use the same variables (name_clean, safe_name, name, TemplateNotFoundError, ConfigLocation) so behavior and signatures remain consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/hookify.function-length.md:
- Around line 9-12: The current matcher (field: new_text, operator: regex_match,
value) only matches "def" at column 0 or four spaces and ignores async functions
and deeper indentation; update the regex in the value to accept arbitrary
leading whitespace and an optional "async" token (e.g., match patterns like
"^\s*(?:async\s+)?def\s") so nested and async function definitions trigger the
rule and enforce the "Functions must be less than 50 lines" reminder
consistently.
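A quick check of the suggested matcher (the hookify rule's exact regex syntax is assumed to be Python-compatible):

```python
import re

# refined matcher sketch: any indentation, optional "async" prefix
DEF_RE = re.compile(r"^\s*(?:async\s+)?def\s", re.MULTILINE)

assert DEF_RE.search("    async def handler():")  # nested async def now matches
assert DEF_RE.search("def top_level():")          # plain top-level def still matches
assert not DEF_RE.search("undefined = 1")         # "def" inside a word is ignored
```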
In @.claude/hookify.missing-logger.md:
- Around line 27-32: The hook's conditions on field `new_text` are too narrow:
they only match top-level "def " or "class " and only search the diff hunk for
the literal "get_logger", causing false positives when a module already
imports/initializes logger or uses "async def" or indented members. Update the
rule to (1) match function/class declarations including leading whitespace and
optional "async" (e.g., regex that accepts optional leading whitespace and
"async "), and (2) check the entire file content (or a broader context) for the
presence of both the import symbol "from ai_company.observability import
get_logger" and the initialization pattern "logger = get_logger(__name__)"
instead of only scanning the diff hunk; ensure the operator uses a file-level
search or adds conditions that verify both symbols are present to suppress the
hook when a module is already compliant.
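A file-level compliance check along these lines could be sketched as follows (the helper name and semantics are assumptions, not the hookify engine's API):

```python
# Both the import and the initialization must appear somewhere in the module
# for it to count as compliant; a diff hunk alone cannot establish that.
REQUIRED = (
    "from ai_company.observability import get_logger",
    "logger = get_logger(__name__)",
)

def module_is_compliant(source: str) -> bool:
    """Return True when the full module source contains both logger snippets."""
    return all(snippet in source for snippet in REQUIRED)

assert module_is_compliant(
    "from ai_company.observability import get_logger\n"
    "logger = get_logger(__name__)\n"
    "async def run(): ...\n"
)
assert not module_is_compliant("def run(): ...\n")
```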
In @.claude/hookify.pep758-except.md:
- Around line 5-11: The rule in .claude/hookify.pep758-except.md currently flags
any parenthesized except via the "pattern: except\s*\(" entry, which wrongly
warns valid "except (A, B) as exc:" forms; update the pattern so it only warns
when a parenthesized except is NOT followed by an "as" clause (i.e., exclude
cases where the closing parenthesis is immediately followed by "as" and an
identifier), keeping the existing warning/action behavior and preserving the
note about the allowed "except (A, B) as exc:" form.
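One way to express "parenthesized except not followed by an as clause" as a single regex (this exact pattern is an assumption, not the rule's current value):

```python
import re

# warn only when the closing parenthesis is NOT followed by "as <name>"
PATTERN = re.compile(r"except\s*\([^)]*\)\s*(?!as\b)\s*:")

assert PATTERN.search("except (A, B):")            # should warn: redundant parens
assert not PATTERN.search("except (A, B) as exc:") # valid form: no warning
```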
In `@DESIGN_SPEC.md`:
- Line 2472: Update the table row for Crash recovery to avoid overstating
automatic reassignment: change the description for M3/FailAndReassignStrategy so
it says the engine will "mark FAILED / eligible for reassignment" (or similar)
rather than asserting automated reassignment is performed; keep references to
the pluggable RecoveryStrategy and FailAndReassignStrategy, and preserve the
note about model_copy and future CheckpointStrategy/AgentContext checkpointing
so the wording aligns with §6.6.
In `@src/ai_company/engine/agent_engine.py`:
- Around line 229-242: When post-processing fails after
_run_loop_with_timeout(), the fallback path currently rebuilds from the stale
pre-loop ctx from run(), losing the true final state in
execution_result.context; update the error handling in the blocks that call
_post_execution_pipeline and _build_and_log_result (including the other
occurrences around the second and third locations) to detect a failed
post-processing and pass the final execution_result.context (or collapse into
execution_result) into _handle_fatal_error (or into whatever fallback builder
currently uses ctx) so the fallback rebuild uses execution_result.context
(conversation, turns, accumulated_cost, task state) rather than the pre-loop
ctx; ensure _handle_fatal_error signature or the fallback builder is adjusted to
accept and prefer execution_result.context when present.
In `@src/ai_company/engine/react_loop.py`:
- Around line 82-88: The loop condition is checking the bound method object
instead of its boolean result; change the while condition to call the method
(use ctx.has_turns_remaining()) so the loop respects the AgentContext max turns.
Locate the while loop in react_loop.py (after _prepare_loop and before
_check_budget) and replace the current bare reference to has_turns_remaining
with a call to has_turns_remaining() on ctx.
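The underlying gotcha: a bound method object is always truthy, so a `while ctx.has_turns_remaining:` condition never consults the counter. A minimal reproduction (the Ctx class is illustrative, not the real AgentContext):

```python
class Ctx:
    def __init__(self, max_turns: int) -> None:
        self.turns_used = 0
        self.max_turns = max_turns

    def has_turns_remaining(self) -> bool:
        return self.turns_used < self.max_turns

ctx = Ctx(max_turns=0)
assert bool(ctx.has_turns_remaining) is True  # the bug: method object is truthy
assert ctx.has_turns_remaining() is False     # the fix: call the method
```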
In `@src/ai_company/observability/correlation.py`:
- Around line 19-25: Replace the private import of get_logger with the public
package import: stop importing from ai_company.observability._logger and instead
import get_logger from ai_company.observability, then keep the existing logger =
get_logger(__name__) line; update the import statement that currently references
ai_company.observability._logger to use ai_company.observability and ensure
CORRELATION_ASYNC_DECORATOR_MISUSE and CORRELATION_SYNC_DECORATOR_MISUSE imports
remain unchanged.
In `@src/ai_company/providers/registry.py`:
- Around line 178-194: Add logging calls before raising in the error branches of
_build_driver and the non-callable branch in _resolve_factory: log the exception
when factory(name, config) fails (use logger.error or logger.warning with
context including provider name, driver_type and exc) and log a warning/error
when the produced driver is not an instance of BaseCompletionProvider (include
provider and driver_type). Also log when factory is not callable in
_resolve_factory with the same contextual fields. Ensure you reference the same
variables (driver_type, name, config, factory, driver) and keep the raise of
DriverFactoryNotFoundError unchanged after the log.
- Around line 178-194: The code mistakenly raises DriverFactoryNotFoundError for
runtime factory failures and wrong return types; change these paths to raise a
distinct error (e.g., DriverFactoryBrokenError or ProviderInitializationError)
instead of DriverFactoryNotFoundError, preserve the original exception as the
__cause__ (use "from exc") and include the original exception text in the
context, and do the same for the other occurrence around the
factory-result/type-check block so callers can distinguish "missing factory"
(DriverFactoryNotFoundError) from "factory exists but is broken/returned wrong
type" (new error) while still referencing DriverFactoryNotFoundError, factory,
driver, and BaseCompletionProvider to locate the code to change.
In `@src/ai_company/templates/renderer.py`:
- Around line 288-299: The _validate_list helper currently only checks that
rendered_data[key] is a list but not that each element is a dict, which leads to
AttributeError later in _expand_single_agent and _build_departments; update
_validate_list to iterate over the returned list and ensure each item is a dict
(or raise TemplateRenderError with a clear message mentioning the key and the
offending index/type) so callers like _expand_single_agent and
_build_departments receive validated dict items and get a clean
TemplateRenderError on bad input.
- Around line 369-380: The code currently catches a KeyError from
get_personality_preset(preset_name) and only logs a warning, allowing a template
with an unknown personality_preset to proceed silently; change this to reject
the invalid preset by re-raising a clear exception (or raising a ValueError)
after logging so template rendering fails fast. Locate the block using
agent.get("personality_preset"), get_personality_preset, and
agent_dict["personality"], keep or log the TEMPLATE_PERSONALITY_PRESET_UNKNOWN
diagnostic with preset_name/agent_name/available/error, and then raise an
exception (including the preset_name and agent name) instead of swallowing the
KeyError so callers know the template is invalid.
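The fail-fast behavior could look roughly like this; `PRESETS`, `apply_preset`, and the log call are simplified placeholders for the real `get_personality_preset` plumbing:

```python
import logging

logger = logging.getLogger(__name__)

PRESETS = {"friendly": {"tone": "warm"}}  # placeholder preset table

def get_personality_preset(name: str) -> dict:
    return PRESETS[name]  # raises KeyError for unknown names

def apply_preset(agent: dict, agent_dict: dict) -> None:
    preset_name = agent.get("personality_preset")
    if preset_name is None:
        return
    try:
        agent_dict["personality"] = get_personality_preset(preset_name)
    except KeyError as exc:
        # Keep the diagnostic, then fail fast instead of proceeding silently.
        logger.warning(
            "TEMPLATE_PERSONALITY_PRESET_UNKNOWN preset=%s agent=%s available=%s",
            preset_name, agent.get("name"), sorted(PRESETS),
        )
        raise ValueError(
            f"unknown personality_preset {preset_name!r} "
            f"for agent {agent.get('name')!r}"
        ) from exc
```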
In `@src/ai_company/tools/file_system/list_directory.py`:
- Around line 50-65: The output prefixes are misaligned because entry.is_dir()
returns "[DIR] {display}/" with two spaces while entry.is_symlink() and the
file branch use a single space; update the directory branch (the return in the
entry.is_dir() check) to use a single space like the others (e.g., "[DIR]
{display}/") so the prefixes for entry.is_symlink(), entry.is_dir(), and the
file return lines are consistently formatted.
In `@tests/unit/core/test_role_catalog.py`:
- Around line 252-265: The two new tests test_no_duplicate_seniority_levels and
test_no_missing_seniority_levels in TestCatalogGuards duplicate existing
assertions in TestSeniorityInfo (test_no_duplicate_levels and
test_covers_all_seniority_levels); remove these duplicate methods from
TestCatalogGuards to avoid redundancy, or alternatively replace them with
guard-specific tests that mock/patch SENIORITY_INFO (or the lookup function used
by TestCatalogGuards such as the mapping access) to assert the guard behavior
(like test_raises_lookup_error_for_missing_level) rather than re-checking the
same set/coverage logic.
In `@tests/unit/core/test_task_transitions.py`:
- Around line 151-158: The test test_module_level_guard_detects_missing_status
does not exercise the module-level import-time guard because it runs after
import; either remove this duplicate completeness check, or replace it with an
isolated import-time test that actually triggers the guard: write a new test
that spawns a fresh Python interpreter (e.g., via subprocess.run) which imports
the module that defines TaskStatus and VALID_TRANSITIONS and assert that the
import fails with ValueError (or that stderr contains the guard message); this
ensures the module-level guard is exercised at import time rather than checking
sets post-import. Include references to TaskStatus, VALID_TRANSITIONS and the
test name when making the change.
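A self-contained sketch of such an import-time guard test; the inline module body is a stand-in for the real module defining TaskStatus and VALID_TRANSITIONS, with one status deliberately left out of the mapping:

```python
import subprocess
import sys
import textwrap

def test_import_time_guard_fires_in_fresh_interpreter() -> None:
    # Stand-in module body mirroring a module-level completeness guard.
    module_body = textwrap.dedent("""
        from enum import Enum

        class TaskStatus(Enum):
            PENDING = "pending"
            DONE = "done"

        VALID_TRANSITIONS = {TaskStatus.PENDING: {TaskStatus.DONE}}

        missing = set(TaskStatus) - set(VALID_TRANSITIONS)
        if missing:
            raise ValueError(f"transitions missing for: {missing}")
    """)
    proc = subprocess.run(
        [sys.executable, "-c", module_body],
        capture_output=True,
        text=True,
    )
    # The guard must abort the fresh interpreter and report on stderr.
    assert proc.returncode != 0
    assert "transitions missing for" in proc.stderr
```

In the real test the `-c` payload would simply `import` the production module; the inline body above just makes the sketch runnable on its own.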
In `@tests/unit/core/test_task.py`:
- Around line 266-274: The test test_non_active_allows_assigned_to currently
only checks task.assigned_to; update it to also assert that the constructed task
retains the provided status by adding an assertion like assert task.status ==
status inside the same test (locate the test_non_active_allows_assigned_to
function and the _make_task call that uses TaskStatus values), ensuring the
parametrized TaskStatus cases (BLOCKED, CANCELLED, FAILED) are verified for both
assigned_to and status.
In `@tests/unit/engine/test_prompt_template.py`:
- Around line 44-50: The test currently recomputes the missing levels locally
instead of exercising the real guard in ai_company.engine.prompt_template;
change the test to invoke the actual guard path by either (A) importing and
calling a guard helper (e.g., validate_autonomy_instructions) from
ai_company.engine.prompt_template with the mutated AUTONOMY_INSTRUCTIONS and
asserting it raises the expected exception, or (B) monkeypatching
ai_company.engine.prompt_template.AUTONOMY_INSTRUCTIONS to the incomplete dict
and using importlib.reload(prompt_template) to trigger the module-level guard,
then assert the guard reports the removed SeniorityLevel; reference
AUTONOMY_INSTRUCTIONS, SeniorityLevel, ai_company.engine.prompt_template, and
importlib.reload or validate_autonomy_instructions to locate the code.
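Option (A) could be sketched like this, with a placeholder `SeniorityLevel` enum and `AUTONOMY_INSTRUCTIONS` dict standing in for the real module contents:

```python
from enum import Enum

class SeniorityLevel(Enum):
    JUNIOR = "junior"
    SENIOR = "senior"

# Placeholder for ai_company.engine.prompt_template.AUTONOMY_INSTRUCTIONS.
AUTONOMY_INSTRUCTIONS = {
    level: f"instructions for {level.value}" for level in SeniorityLevel
}

def validate_autonomy_instructions(instructions: dict) -> None:
    """Guard helper (option A): callable at import time and from tests."""
    missing = set(SeniorityLevel) - set(instructions)
    if missing:
        raise ValueError(
            f"AUTONOMY_INSTRUCTIONS missing levels: "
            f"{sorted(m.name for m in missing)}"
        )

validate_autonomy_instructions(AUTONOMY_INSTRUCTIONS)  # module-level guard

def test_guard_reports_removed_level() -> None:
    mutated = dict(AUTONOMY_INSTRUCTIONS)
    del mutated[SeniorityLevel.SENIOR]
    try:
        validate_autonomy_instructions(mutated)
    except ValueError as exc:
        assert "SENIOR" in str(exc)
    else:
        raise AssertionError("guard did not fire")
```

The test now exercises the same code path the module runs at import, rather than recomputing the set difference locally.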
In `@tests/unit/observability/test_sinks.py`:
- Around line 205-219: The test test_file_open_oserror_raises_runtime_error
currently patches logging.handlers.RotatingFileHandler.__init__ which prevents
the handler object from being constructed and is brittle; change the mock to
target the actual file-open call (e.g., patch logging.FileHandler.__init__ or
the built-in open used by logging) so that _build_file_handler executes its
normal constructor path and raises OSError at the file-open step, ensuring the
RuntimeError path in _build_file_handler/SinkConfig handling is exercised
reliably.
In `@tests/unit/templates/test_loader.py`:
- Around line 252-260: The test test_defective_builtin_skipped currently only
checks type and can miss builtin entries; update it to assert that builtins were
actually dropped by either asserting templates == () (since no user templates
exist) or iterating templates from list_templates() and asserting no item has
source == "builtin"; keep the same patch of _load_builtin raising
TemplateRenderError and reference list_templates and _load_builtin in the
updated assertion to ensure the defective built-in was skipped.
- Around line 274-278: The Windows-style backslash case slips past the current
Path(name_clean).name normalization on POSIX; update load_template to explicitly
reject any template name containing either "/" or "\\" before using
Path(name_clean).name (e.g., check if "/" in name_clean or "\\" in name_clean
and raise TemplateNotFoundError("must not contain path separators")), or
alternatively normalize backslashes to the native separator first; ensure the
check references the existing name_clean variable and the load_template function
so the traversal rejection triggers consistently across platforms.
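A hedged sketch of the explicit separator rejection; the function is a condensed stand-in for `load_template`'s validation step, not the real loader:

```python
from pathlib import Path

class TemplateNotFoundError(Exception):
    pass

def sanitize_template_name(name: str) -> str:
    # Hypothetical condensed version of load_template's validation step.
    name_clean = name.strip()
    # Explicitly reject both POSIX and Windows separators so the check does
    # not depend on what Path(...).name happens to do on the current platform.
    if "/" in name_clean or "\\" in name_clean:
        raise TemplateNotFoundError(
            f"template name {name!r} must not contain path separators"
        )
    return Path(name_clean).name
```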
In `@tests/unit/tools/file_system/test_base_fs_tool.py`:
- Around line 37-44: The public test function test_map_os_error is missing the
required Google-style docstring; add a concise docstring above test_map_os_error
that describes the test purpose (verifies _map_os_error maps an OSError to the
expected key and message fragment), and include Google-style Args for exc,
expected_key, and expected_msg_fragment and an Optional Returns/Raises line if
applicable so it satisfies the ruff D rule.
In `@tests/unit/tools/file_system/test_write_file.py`:
- Around line 149-163: The test patches
ai_company.tools.file_system.write_file._write_sync which throws before
mkstemp() runs, so it never exercises the temp-file cleanup/rollback path;
update the test to force failure after the temp file is created by patching a
symbol executed after mkstemp(), e.g. patch pathlib.Path.replace or os.fsync (or
the replace call inside write_file) to raise OSError during the commit/rename
step while still calling write_tool.execute(arguments={"path":
"fail_target.txt", "content": "new"}), then assert no .tmp files remain and the
original file content is unchanged to verify cleanup.
---
Outside diff comments:
In `@src/ai_company/templates/loader.py`:
- Around line 169-176: The current path-traversal check using
Path(name_clean).name is OS-dependent and misses Windows separators; update the
validation in loader.py to explicitly reject any template names containing
either '/' or '\\' before further processing: check name_clean for these
characters and raise the existing TemplateNotFoundError (with the same
ConfigLocation(file_path=f"<template:{name}>")) when found, rather than relying
solely on Path(name_clean).name/safe_name; keep the existing error message and
use the same variables (name_clean, safe_name, name, TemplateNotFoundError,
ConfigLocation) so behavior and signatures remain consistent.
In `@src/ai_company/tools/file_system/edit_file.py`:
- Around line 193-249: The method _perform_edit exceeds the 50-line limit;
extract the result-building and logging block that handles count==0 vs count>0
into a new helper (e.g. _build_edit_result(user_path: str, count: int, old_text:
str) -> ToolExecutionResult) so _perform_edit only performs the edit and
delegates construction of the ToolExecutionResult and related logger calls
(TOOL_FS_EDIT_NOT_FOUND, TOOL_FS_EDIT) to that helper; ensure the helper returns
the same content, metadata keys
("path","occurrences_found","occurrences_replaced") and logging behavior so
callers of _perform_edit still receive identical results.
In `@src/ai_company/tools/file_system/write_file.py`:
- Around line 119-140: The size preflight in _validate_write_args measures UTF-8
bytes but _write_sync opens files in text mode which expands newlines on
Windows; update _write_sync to open the file without newline translation (e.g.,
use os.fdopen(fd, "w", encoding="utf-8", newline="") or switch to binary mode
"wb") so the actual bytes written will match the UTF-8 byte count used by
_validate_write_args and keep MAX_WRITE_SIZE_BYTES enforcement consistent;
verify references to _validate_write_args, _write_sync, and MAX_WRITE_SIZE_BYTES
when making the change.
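The mismatch can be shown directly. `write_utf8_exact` is an illustrative helper, and `newline="\r\n"` is used to force, on any platform, the expansion that Windows text mode would apply to `"\n"`:

```python
import os
import tempfile

def write_utf8_exact(path: str, content: str) -> int:
    # newline="" disables newline translation, so bytes on disk match the
    # UTF-8 preflight count len(content.encode("utf-8")) even on Windows.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    with os.fdopen(fd, "w", encoding="utf-8", newline="") as handle:
        handle.write(content)
    return os.path.getsize(path)

def simulate_windows_expansion(path: str, content: str) -> int:
    # newline="\r\n" reproduces the translation Windows text mode applies,
    # making the file larger than the preflight byte count.
    with open(path, "w", encoding="utf-8", newline="\r\n") as handle:
        handle.write(content)
    return os.path.getsize(path)
```

With translation enabled, each `"\n"` becomes two bytes, so a size preflight measured via `content.encode("utf-8")` undercounts what actually lands on disk.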
In `@tests/unit/tools/git/test_git_tools.py`:
- Around line 56-69: The test test_symlink_escape_blocked creates outside =
git_repo.parent / "outside_dir" and calls outside.mkdir() which can raise
FileExistsError in parallel runs; change the mkdir call to use exist_ok=True
(i.e., outside.mkdir(exist_ok=True)) or generate a unique temp dir (e.g., via
tmp_path_factory or tempfile.mkdtemp()) so the setup for outside_dir is robust
under pytest-xdist, leaving the rest of the test (link.symlink_to and invoking
GitDiffTool.execute) unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: a25c05c5-cc72-4677-a883-83a6a7c4c507
📒 Files selected for processing (49)
- .claude/hookify.function-length.md
- .claude/hookify.missing-logger.md
- .claude/hookify.pep758-except.md
- .pre-commit-config.yaml
- CLAUDE.md
- DESIGN_SPEC.md
- src/ai_company/budget/tracker.py
- src/ai_company/config/loader.py
- src/ai_company/config/utils.py
- src/ai_company/engine/agent_engine.py
- src/ai_company/engine/prompt.py
- src/ai_company/engine/react_loop.py
- src/ai_company/engine/task_execution.py
- src/ai_company/observability/correlation.py
- src/ai_company/observability/events/config.py
- src/ai_company/observability/events/correlation.py
- src/ai_company/providers/base.py
- src/ai_company/providers/registry.py
- src/ai_company/providers/resilience/retry.py
- src/ai_company/providers/routing/_strategy_helpers.py
- src/ai_company/templates/loader.py
- src/ai_company/templates/presets.py
- src/ai_company/templates/renderer.py
- src/ai_company/templates/schema.py
- src/ai_company/tools/_git_base.py
- src/ai_company/tools/file_system/delete_file.py
- src/ai_company/tools/file_system/edit_file.py
- src/ai_company/tools/file_system/list_directory.py
- src/ai_company/tools/file_system/read_file.py
- src/ai_company/tools/file_system/write_file.py
- src/ai_company/tools/git_tools.py
- src/ai_company/tools/invoker.py
- tests/unit/core/test_role_catalog.py
- tests/unit/core/test_task.py
- tests/unit/core/test_task_transitions.py
- tests/unit/engine/test_agent_engine.py
- tests/unit/engine/test_agent_engine_errors.py
- tests/unit/engine/test_prompt_template.py
- tests/unit/observability/test_events.py
- tests/unit/observability/test_sinks.py
- tests/unit/templates/test_errors.py
- tests/unit/templates/test_loader.py
- tests/unit/templates/test_renderer.py
- tests/unit/templates/test_schema.py
- tests/unit/tools/file_system/test_base_fs_tool.py
- tests/unit/tools/file_system/test_list_directory.py
- tests/unit/tools/file_system/test_write_file.py
- tests/unit/tools/git/test_git_tools.py
- tests/unit/tools/test_permissions.py
- field: new_text
  operator: regex_match
  value: "^(def |class )"
- field: new_text
  operator: not_contains
  value: "get_logger"
**`new_text` makes this hook false-positive on already compliant modules.**
The last condition only checks the diff hunk for `get_logger`, so editing a function in a file that already imports and initializes the logger will still warn unless those top-of-file lines are part of the same patch. It also misses `async def` and indented members for the same reason. As per coding guidelines: "Every module with business logic MUST import: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.claude/hookify.missing-logger.md around lines 27 - 32, The hook's
conditions on field `new_text` are too narrow: they only match top-level "def "
or "class " and only search the diff hunk for the literal "get_logger", causing
false positives when a module already imports/initializes logger or uses "async
def" or indented members. Update the rule to (1) match function/class
declarations including leading whitespace and optional "async" (e.g., regex that
accepts optional leading whitespace and "async "), and (2) check the entire file
content (or a broader context) for the presence of both the import symbol "from
ai_company.observability import get_logger" and the initialization pattern
"logger = get_logger(__name__)" instead of only scanning the diff hunk; ensure
the operator uses a file-level search or adds conditions that verify both
symbols are present to suppress the hook when a module is already compliant.
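A sketch of the broadened matching logic the prompt describes; the hookify rule syntax itself is tool-specific, so this just models the two conditions in plain Python:

```python
import re

# Broadened declaration matcher: leading indent and optional "async" allowed.
DECL_RE = re.compile(r"^\s*(?:async\s+)?def\s|^\s*class\s", re.MULTILINE)

def needs_logger_warning(new_text: str, file_content: str) -> bool:
    """Warn only when the *file* lacks the logger boilerplate, not the hunk."""
    if not DECL_RE.search(new_text):
        return False
    has_import = (
        "from ai_company.observability import get_logger" in file_content
    )
    has_init = "logger = get_logger(__name__)" in file_content
    return not (has_import and has_init)
```

Checking the full file content for both symbols suppresses the warning on already compliant modules, even when those top-of-file lines are outside the diff hunk.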
        execution_result = await self._post_execution_pipeline(
            execution_result,
            identity,
            agent_id,
            task_id,
        )

        return self._build_and_log_result(
            execution_result,
            system_prompt,
            start,
            agent_id,
            task_id,
        )
Preserve the loop’s final context when post-processing fails.
Once _run_loop_with_timeout() returns, failures in _post_execution_pipeline() or _build_and_log_result() still go through _handle_fatal_error(), but that path can only rebuild from the stale pre-loop ctx from run(). The fallback ExecutionResult therefore loses the real final conversation, turns, accumulated_cost, and task state from execution_result.context.
🧯 Proposed fix
- execution_result = await self._post_execution_pipeline(
- execution_result,
- identity,
- agent_id,
- task_id,
- )
-
- return self._build_and_log_result(
- execution_result,
- system_prompt,
- start,
- agent_id,
- task_id,
- )
+ try:
+ execution_result = await self._post_execution_pipeline(
+ execution_result,
+ identity,
+ agent_id,
+ task_id,
+ )
+ return self._build_and_log_result(
+ execution_result,
+ system_prompt,
+ start,
+ agent_id,
+ task_id,
+ )
+ except MemoryError, RecursionError:
+ raise
+ except Exception as exc:
+ return await self._handle_fatal_error(
+ exc=exc,
+ identity=identity,
+ task=task,
+ agent_id=agent_id,
+ task_id=task_id,
+ duration_seconds=time.monotonic() - start,
+ ctx=execution_result.context,
+ execution_result=execution_result,
+ system_prompt=system_prompt,
+            )

 async def _handle_fatal_error(
self,
*,
exc: Exception,
identity: AgentIdentity,
task: Task,
agent_id: str,
task_id: str,
duration_seconds: float,
ctx: AgentContext | None = None,
+ execution_result: ExecutionResult | None = None,
system_prompt: SystemPrompt | None = None,
) -> AgentRunResult:
@@
error_execution = await self._build_error_execution(
identity,
task,
agent_id,
task_id,
error_msg,
ctx,
+ execution_result,
)
@@
async def _build_error_execution(
self,
identity: AgentIdentity,
task: Task,
agent_id: str,
task_id: str,
error_msg: str,
ctx: AgentContext | None,
+ execution_result: ExecutionResult | None,
) -> ExecutionResult:
"""Create an error ``ExecutionResult`` and apply recovery."""
- error_ctx = ctx or AgentContext.from_identity(identity, task=task)
+ error_ctx = (
+ execution_result.context
+ if execution_result is not None
+ else ctx or AgentContext.from_identity(identity, task=task)
+    )

Also applies to: 685-698, 725-745
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/engine/agent_engine.py` around lines 229 - 242, When
post-processing fails after _run_loop_with_timeout(), the fallback path
currently rebuilds from the stale pre-loop ctx from run(), losing the true final
state in execution_result.context; update the error handling in the blocks that
call _post_execution_pipeline and _build_and_log_result (including the other
occurrences around the second and third locations) to detect a failed
post-processing and pass the final execution_result.context (or collapse into
execution_result) into _handle_fatal_error (or into whatever fallback builder
currently uses ctx) so the fallback rebuild uses execution_result.context
(conversation, turns, accumulated_cost, task state) rather than the pre-loop
ctx; ensure _handle_fatal_error signature or the fallback builder is adjusted to
accept and prefer execution_result.context when present.
def test_file_open_oserror_raises_runtime_error(self, tmp_path: Path) -> None:
    """File open OSError is wrapped in RuntimeError."""
    sink = SinkConfig(
        sink_type=SinkType.FILE,
        file_path="app.log",
        rotation=RotationConfig(strategy=RotationStrategy.BUILTIN),
    )
    with (
        patch(
            "logging.handlers.RotatingFileHandler.__init__",
            side_effect=OSError("disk full"),
        ),
        pytest.raises(RuntimeError, match="Failed to open log file"),
    ):
        _build_file_handler(sink, tmp_path)
🧹 Nitpick | 🔵 Trivial
Patching __init__ may not reliably simulate file-open failure.
The RotatingFileHandler.__init__ internally calls BaseRotatingHandler.__init__, which then calls logging.FileHandler.__init__ to open the file. Patching __init__ with side_effect=OSError prevents the object from being constructed at all, which happens to raise OSError before entering the implementation's try block.
This works by accident — the OSError is raised during handler construction, which the implementation catches. However, this approach is fragile. A more precise approach would be to patch the actual file-opening call.
♻️ Optional: More precise mock target
def test_file_open_oserror_raises_runtime_error(self, tmp_path: Path) -> None:
"""File open OSError is wrapped in RuntimeError."""
sink = SinkConfig(
sink_type=SinkType.FILE,
file_path="app.log",
rotation=RotationConfig(strategy=RotationStrategy.BUILTIN),
)
with (
patch(
- "logging.handlers.RotatingFileHandler.__init__",
+ "logging.handlers.RotatingFileHandler",
side_effect=OSError("disk full"),
),
pytest.raises(RuntimeError, match="Failed to open log file"),
):
        _build_file_handler(sink, tmp_path)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def test_file_open_oserror_raises_runtime_error(self, tmp_path: Path) -> None:
    """File open OSError is wrapped in RuntimeError."""
    sink = SinkConfig(
        sink_type=SinkType.FILE,
        file_path="app.log",
        rotation=RotationConfig(strategy=RotationStrategy.BUILTIN),
    )
    with (
        patch(
            "logging.handlers.RotatingFileHandler",
            side_effect=OSError("disk full"),
        ),
        pytest.raises(RuntimeError, match="Failed to open log file"),
    ):
        _build_file_handler(sink, tmp_path)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/observability/test_sinks.py` around lines 205 - 219, The test
test_file_open_oserror_raises_runtime_error currently patches
logging.handlers.RotatingFileHandler.__init__ which prevents the handler object
from being constructed and is brittle; change the mock to target the actual
file-open call (e.g., patch logging.FileHandler.__init__ or the built-in open
used by logging) so that _build_file_handler executes its normal constructor
path and raises OSError at the file-open step, ensuring the RuntimeError path in
_build_file_handler/SinkConfig handling is exercised reliably.
def test_map_os_error(
    exc: OSError,
    expected_key: str,
    expected_msg_fragment: str,
) -> None:
    key, msg = _map_os_error(exc, "/workspace/missing.txt", "reading")
    assert key == expected_key
    assert msg == expected_msg_fragment
Add the required docstring to the public test function.
test_map_os_error is typed but still missing the docstring required by repo rules.
✏️ Proposed fix
def test_map_os_error(
exc: OSError,
expected_key: str,
expected_msg_fragment: str,
) -> None:
+ """Verifies `_map_os_error` returns expected mappings for common OS errors."""
key, msg = _map_os_error(exc, "/workspace/missing.txt", "reading")
assert key == expected_key
    assert msg == expected_msg_fragment
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/tools/file_system/test_base_fs_tool.py` around lines 37 - 44, The
public test function test_map_os_error is missing the required Google-style
docstring; add a concise docstring above test_map_os_error that describes the
test purpose (verifies _map_os_error maps an OSError to the expected key and
message fragment), and include Google-style Args for exc, expected_key, and
expected_msg_fragment and an Optional Returns/Raises line if applicable so it
satisfies the ruff D rule.
…, Greptile, Copilot

Critical:
- Fix OS-dependent path traversal check in loader.py (explicit / \ .. rejection)
- Fix symlink test FileExistsError in parallel xdist runs (exist_ok=True)

Major:
- Widen credential regex to catch token-only URLs (https://<token>@host)
- Remove duplicate TEMPLATE_PERSONALITY_PRESET_UNKNOWN log from presets.py
- Raise TemplateRenderError on unknown personality preset (was silent fallthrough)
- Validate list item types in _validate_list (reject non-dict entries)
- Fix hookify rules: async def support, file-level logger check, exclude as-clause
- Add pytestmark timeout(30) to test_agent_engine.py and test_loader.py
- Import get_logger from public package, not private _logger module

Medium:
- Upgrade tags to tuple[NotBlankStr, ...] per convention
- Fix [DIR] double space formatting in list_directory.py
- Soften DESIGN_SPEC.md crash recovery wording (eligible for reassignment)
- Add logging before raises in registry.py _build_driver/_resolve_factory
- Fix test_write_file to patch os.fsync (actually tests temp-file cleanup)
- Fix max_bytes docstring in read_file.py (reads characters, not bytes)

Minor:
- Remove duplicate seniority guard tests from test_role_catalog.py
- Add status assertion to test_non_active_allows_assigned_to
- Strengthen defective builtin test assertion (== () not isinstance)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| "path": user_path, | ||
| "total_entries": total, |
There was a problem hiding this comment.
[DIR] output format changed from double-space to single-space alignment
The original code used `f"[DIR]  {display}/"` (two spaces) intentionally to align directory and file entries: `[DIR]` (5 chars) + two spaces = 7 chars total, matching `[FILE]` (6 chars) + one space = 7 chars total. This kept the entry name aligned in a column regardless of prefix type.
The new helper returns f"[DIR] {display}/" (one space), breaking that alignment. Since list_directory's output is typically user-facing and may be parsed by agent logic or tests expecting the old format, this is a subtle format change worth addressing.
| "path": user_path, | |
| "total_entries": total, | |
| return f"[DIR] {display}/" |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/tools/file_system/list_directory.py
Line: 249-250
Comment:
**`[DIR]` output format changed from double-space to single-space alignment**
The original code used `f"[DIR] {display}/"` (two spaces) intentionally to align directory and file entries: `[DIR]` (5 chars) + two spaces = 7 chars total, matching `[FILE]` (6 chars) + one space = 7 chars total. This kept the entry name aligned in a column regardless of prefix type.
The new helper returns `f"[DIR] {display}/"` (one space), breaking that alignment. Since `list_directory`'s output is typically user-facing and may be parsed by agent logic or tests expecting the old format, this is a subtle format change worth addressing.
```suggestion
return f"[DIR]  {display}/"
```
How can I resolve this? If you propose a fix, please make it concise.
        if execution_result.termination_reason == TerminationReason.ERROR:
            execution_result = await self._apply_recovery(
                execution_result,
raise exc from None suppresses the exception chain
Changing from raise exc from build_exc to raise exc from None clears __cause__ on the re-raised exception. While build_exc is logged on the preceding line so the information isn't completely lost, any upstream handler that inspects exc.__cause__ or reads a chained traceback will no longer see the build_exc context. Suppressing the chain deliberately with from None is typically reserved for cases where you want to present a cleaner, unrelated exception — here build_exc is directly related to why we're in this path.
If the goal is to re-raise the original exc without implying build_exc caused it, raise exc (without any from) would re-attach build_exc as __context__ while not implying causation, preserving the chain without the misleading from semantics.
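The chaining semantics at stake can be illustrated directly (note that `from None` clears `__cause__` and sets `__suppress_context__`, which hides, but does not delete, `__context__`):

```python
def reraise_plain() -> None:
    try:
        raise RuntimeError("build_exc: result construction failed")
    except RuntimeError:
        exc = ValueError("original loop error")
        # Plain re-raise: RuntimeError becomes __context__, __cause__ stays
        # None, and the chained traceback is still displayed.
        raise exc

def reraise_from_none() -> None:
    try:
        raise RuntimeError("build_exc: result construction failed")
    except RuntimeError:
        exc = ValueError("original loop error")
        # "from None": __cause__ is None and __suppress_context__ is True,
        # so the traceback no longer shows the build failure.
        raise exc from None
```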
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ai_company/engine/agent_engine.py
Line: 266
Comment:
**`raise exc from None` suppresses the exception chain**
Changing from `raise exc from build_exc` to `raise exc from None` clears `__cause__` on the re-raised exception. While `build_exc` is logged on the preceding line so the information isn't completely lost, any upstream handler that inspects `exc.__cause__` or reads a chained traceback will no longer see the `build_exc` context. Suppressing the chain deliberately with `from None` is typically reserved for cases where you want to present a cleaner, unrelated exception — here `build_exc` is directly related to why we're in this path.
If the goal is to re-raise the original `exc` without implying `build_exc` *caused* it, `raise exc` (without any `from`) would re-attach `build_exc` as `__context__` while not implying causation, preserving the chain without the misleading `from` semantics.
How can I resolve this? If you propose a fix, please make it concise.

CI fixes:
- Remove unused type: ignore on Linux (use dual suppression for cross-platform compat) in subprocess_sandbox.py
- Fix test assertion for raise-from-build_exc chaining in test_agent_engine_errors.py

Review feedback (Greptile):
- Rename _check_ref → _check_git_arg with updated docstring (was misleadingly named when used for author/date validation)
- Fix missing space in clone URL error message
- Add EXECUTION_SHUTDOWN_CLEANUP_FAILED event constant (was reusing EXECUTION_SHUTDOWN_CLEANUP for both start and failure)
- Capture partial stderr on timeout in direct subprocess path (mirrors sandbox behavior for better debuggability)
- Make Windows Git PATH prefixes configurable via SubprocessSandboxConfig.extra_safe_path_prefixes
- Add clarifying comment for total_entries semantics when truncated
- Fix ResourceWarning on Windows by explicitly closing subprocess transports after communicate()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Comprehensive post-audit cleanup across 49 files (+2195/-1099 lines):
- Convert `except (A, B):` to `except A, B:` throughout (where no `as` binding)
- Add `logger = get_logger(__name__)` to config/utils.py, observability/correlation.py, templates/presets.py + new events/correlation.py constants
- Security hardening: trailing `..` glob regex fix, `ref2` without `ref1` in GitDiffTool, unknown branch action fallback, credential sanitization in git command logging, `**` without `recursive=True` guard
- Tests: `@pytest.mark.parametrize`, new edge-case coverage (template errors, file system helpers, git tools, guard logic), 3 new test files
- Hookify: `function-length`, `missing-logger`, `pep758-except` prevention rules
- Git tools: guard `--author`/`--since`/`--until` in GitLogTool, credential redaction in git logging
- Schema: `TemplateAgentConfig.model` and `CompanyTemplate.workflow`/`communication` upgraded to `NotBlankStr`
- DESIGN_SPEC.md updates (add `events/correlation.py` to §15.3, crash recovery status → "Adopted" in §15.5)
- `list_directory` `total_entries` now reflects pre-truncation count with `truncated` flag

Closes #148
Pre-PR Review
Pre-reviewed by 10 specialized agents in parallel:
Findings: 16 total (4 MAJOR, 8 MEDIUM, 4 MINOR) → 13 implemented, 3 skipped (intentional design choices)
Test Plan
- `uv run ruff check src/ tests/` — all checks passed
- `uv run ruff format src/ tests/` — all formatted
- `uv run mypy src/ tests/` — no issues in 257 source files
- `uv run pytest tests/ -n auto --cov=ai_company --cov-fail-under=80` — 2324 passed, 4 skipped, 96.58% coverage