feat: testing framework, CI pipeline, and M0 gap fixes #64
Add warn_unreachable, show_error_codes, and enable additional error codes (ignore-without-code, redundant-cast, truthy-bool). Configure pydantic-mypy plugin with init_forbid_extra, init_typed, and warn_required_dynamic_aliases. Closes #25
Change both package ecosystems from daily to weekly (Monday) to reduce noise. Add auto-merge workflow that squash-merges minor/patch dependency updates automatically while requiring manual review for major versions. Closes #52
Add test dependency group (pytest, pytest-asyncio, pytest-cov, pytest-mock, pytest-timeout, pytest-xdist, polyfactory, respx). Configure pytest with custom markers (unit, integration, e2e, slow), strict mode, and 30s timeout. Add coverage config requiring 80% with branch coverage. Create conftest files with integration CI-skip fixture and smoke tests verifying package import, version format, and marker registration. Closes #35
Python 3.14+ has native lazy annotations (PEP 649), making the __future__ import unnecessary. Add banned-api rule to prevent it.
Add reusable composite action for Python+uv setup. Create CI workflow with lint (ruff check + format), type-check (mypy), and test (pytest with coverage + xdist) jobs gated by a ci-pass status check. Add dependency-review workflow for PR security scanning. Add CI badge to README. Closes #51
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly enhances the project's development infrastructure by integrating a comprehensive testing framework and a robust Continuous Integration (CI) pipeline. The changes aim to improve code quality, ensure type safety, automate dependency updates, and provide a reliable system for verifying code correctness and adherence to standards. This foundational work will enable more efficient development and maintenance by catching issues early and standardizing development practices.
📝 Walkthrough
This PR establishes comprehensive testing infrastructure, CI/CD pipelines, and development tooling. It introduces pytest configuration with coverage tracking, GitHub Actions workflows for linting/type-checking/testing, automated dependency management via Dependabot, a reusable Python/uv setup action, and initial smoke tests.
Code Review
This pull request introduces a comprehensive testing framework and CI pipeline, including pytest with coverage, strict mypy flags, and GitHub Actions workflows. However, a potential command injection vulnerability was identified in the newly added GitHub Action. Additionally, areas for improvement include pinning a GitHub Action version for reproducibility and addressing a misconfigured test fixture.
tests/conftest.py (outdated)

    @pytest.fixture(params=["asyncio"])
    def anyio_backend(request: pytest.FixtureRequest) -> str:
        """Configure async backend for anyio-based tests."""
        return str(request.param)

This anyio_backend fixture is used to configure pytest-anyio. However, the project's dependencies in pyproject.toml include pytest-asyncio, not pytest-anyio. These are two different libraries for testing async code. Since the project is configured to use pytest-asyncio (as seen with the asyncio_mode setting), this fixture is unused and misleading. It should be removed to avoid confusion.
.github/actions/setup-python-uv/action.yml

    - name: Install Python
      shell: bash
      run: uv python install ${{ inputs.python-version }}

The run step in the composite action uses ${{ inputs.python-version }} directly in a shell script. This is a potential command injection vulnerability if the input contains shell metacharacters. An attacker who can control the python-version input (e.g., through a dynamic workflow trigger or a malicious PR) could execute arbitrary commands in the CI runner. To remediate this, use environment variables to pass inputs to shell scripts. This ensures that the input is treated as a string and not as part of the command.
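To make the mechanism concrete, here is a minimal Python sketch, assuming a POSIX `sh` is available and using `echo` as a stand-in for `uv python install`. The `malicious` value and both helper names are hypothetical. The first helper pastes the value into the script text, much as `${{ ... }}` expansion does before the shell runs; the second passes it through an environment variable and quotes the expansion.

```python
import os
import subprocess

# Hypothetical input; the "; echo INJECTED" suffix stands in for an attacker payload.
malicious = "3.14; echo INJECTED"


def run_inline(value: str) -> str:
    """Mimic inline expression expansion: the value becomes part of the script text."""
    script = f"echo install {value}"
    result = subprocess.run(
        ["sh", "-c", script], capture_output=True, text=True, check=True
    )
    return result.stdout


def run_with_env(value: str) -> str:
    """Pass the value through the environment and quote the expansion."""
    script = 'echo install "$PYTHON_VERSION"'
    env = {**os.environ, "PYTHON_VERSION": value}
    result = subprocess.run(
        ["sh", "-c", script], capture_output=True, text=True, check=True, env=env
    )
    return result.stdout


print(run_inline(malicious))    # the shell sees two commands
print(run_with_env(malicious))  # the shell sees one command with one argument
```

With the inline form the payload runs as a second command; with the environment variable it stays a single literal argument.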
Suggested change:

    - name: Install Python
      shell: bash
      env:
        PYTHON_VERSION: ${{ inputs.python-version }}
      run: uv python install "$PYTHON_VERSION"

.github/actions/setup-python-uv/action.yml

      using: composite
      steps:
        - name: Install uv
          uses: astral-sh/setup-uv@v6

Pull request overview
This PR establishes the testing framework, CI pipeline, and addresses M0 configuration gaps for type checking and dependency management. It introduces pytest with coverage tracking (80% threshold), custom test markers (unit/integration/e2e/slow), CI workflows for linting, type checking, and testing, and enhances mypy's strict mode configuration. The PR also changes Dependabot from daily to weekly updates and adds an auto-merge workflow for minor/patch dependency updates.
Changes:
- Added pytest testing framework with coverage (80% threshold), custom markers, and smoke tests
- Implemented CI pipeline with lint, type-check, test jobs, and reusable setup action
- Enhanced mypy configuration with additional strict flags (`warn_unreachable`, `show_error_codes`, `enable_error_code`) and pydantic-mypy plugin settings
- Configured Dependabot for weekly updates (Mondays) with auto-merge for minor/patch versions
- Banned `from __future__ import annotations` via ruff (citing PEP 649 and Python 3.14+)
Reviewed changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `tests/unit/test_smoke.py` | Smoke tests verifying package import, version format, and marker registration |
| `tests/conftest.py` | Root test fixtures with anyio backend configuration |
| `tests/unit/conftest.py` | Empty unit test configuration placeholder |
| `tests/integration/conftest.py` | Integration test fixtures with CI skip logic (requires `RUN_INTEGRATION_TESTS=1`) |
| `tests/e2e/conftest.py` | Empty e2e test configuration placeholder |
| `pyproject.toml` | Added test dependencies, pytest/coverage config, mypy strict flags, pydantic-mypy settings, and banned-api rule |
| `README.md` | Added CI badge |
| `.github/dependabot.yml` | Changed schedule from daily to weekly (Monday) |
| `.github/workflows/ci.yml` | Main CI workflow with lint, type-check, test, and gate jobs |
| `.github/workflows/dependabot-auto-merge.yml` | Auto-merge workflow for non-major dependency updates |
| `.github/workflows/dependency-review.yml` | Dependency security scanning for PRs |
| `.github/actions/setup-python-uv/action.yml` | Reusable composite action for Python and uv setup |
pyproject.toml

    [tool.ruff.lint.flake8-tidy-imports.banned-api]
    "__future__.annotations".msg = "Python 3.14+ has native lazy annotations (PEP 649). Do not use `from __future__ import annotations`."

The banned-api message claims "Python 3.14+ has native lazy annotations (PEP 649)" but this is misleading. While PEP 649 is scheduled for Python 3.14, the actual behavior and whether it's enabled by default is still being finalized. The message incorrectly implies that from __future__ import annotations is unnecessary on Python 3.14+. Additionally, if the project truly requires Python 3.14 minimum (which is problematic - see other comments), banning this import makes sense, but if the project should support 3.12+ per DESIGN_SPEC.md, this rule is premature and would prevent backwards-compatible code from being written.
tests/conftest.py (outdated)

    @pytest.fixture(params=["asyncio"])
    def anyio_backend(request: pytest.FixtureRequest) -> str:
        """Configure async backend for anyio-based tests."""
        return str(request.param)

The anyio_backend fixture is defined but anyio is not listed as a dependency in the test or dev dependency groups. This fixture appears to be for pytest-anyio integration, but the pytest configuration uses asyncio_mode = "auto" which is a pytest-asyncio feature that doesn't require anyio. Either add pytest-anyio and anyio as dependencies if anyio support is intended, or remove this fixture if only asyncio support is needed.
Suggested change (remove the fixture):

    -@pytest.fixture(params=["asyncio"])
    -def anyio_backend(request: pytest.FixtureRequest) -> str:
    -    """Configure async backend for anyio-based tests."""
    -    return str(request.param)

.github/workflows/dependabot-auto-merge.yml

    - name: Auto-merge minor and patch updates
      if: steps.metadata.outputs.update-type != 'version-update:semver-major'
      run: gh pr merge --auto --squash "$PR_URL"

The Dependabot auto-merge workflow will auto-merge PRs without waiting for CI checks to pass. The workflow should include a condition to ensure required CI checks (lint, type-check, test) pass before merging. Consider adding a check that waits for the 'CI Pass' job to succeed, or configure branch protection rules to require status checks before merging.
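As a rough sketch of the gating decision itself (not the workflow in this PR), the function below uses an allow-list of update types instead of the `!= 'version-update:semver-major'` comparison, so an empty or unexpected `update-type` is not merged. The function name is hypothetical; the update-type strings are the values `dependabot/fetch-metadata` reports. Separately, `gh pr merge --auto` only enables auto-merge and GitHub completes the merge once the branch's requirements are met, so whether CI gates the merge depends on branch protection requiring those checks.

```python
# Update types eligible for unattended merging (assumed policy: minor and patch only).
AUTO_MERGE_UPDATE_TYPES = frozenset(
    {"version-update:semver-minor", "version-update:semver-patch"}
)


def should_enable_auto_merge(pr_author: str, update_type: str) -> bool:
    """Return True only for Dependabot PRs whose update type is explicitly allowed."""
    return pr_author == "dependabot[bot]" and update_type in AUTO_MERGE_UPDATE_TYPES


print(should_enable_auto_merge("dependabot[bot]", "version-update:semver-patch"))
print(should_enable_auto_merge("dependabot[bot]", "version-update:semver-major"))
```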
Suggested change:

    -run: gh pr merge --auto --squash "$PR_URL"
    +run: |
    +  # Wait for all required status checks to pass before merging
    +  gh pr checks "$PR_URL" --watch --required
    +  gh pr merge --auto --squash "$PR_URL"

.github/workflows/ci.yml

    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.14"]

The test matrix only includes Python 3.14, but the linked issue #51 mentions "Matrix testing (Python 3.12+, consider 3.13)" and DESIGN_SPEC.md specifies "Python 3.12+" as the minimum version. The matrix should include at least 3.12 and 3.13 to ensure compatibility across the supported version range. Testing only the cutting-edge 3.14 version (which may not be widely available yet) creates a risk of compatibility issues going undetected.
Suggested change:

    -python-version: ["3.14"]
    +python-version: ["3.12", "3.13", "3.14"]

.github/actions/setup-python-uv/action.yml

    python-version:
      description: Python version to install
      required: false
      default: "3.14"

The default Python version is set to 3.14, which is inconsistent with the DESIGN_SPEC.md requirement of Python 3.12+. Python 3.14 was likely just released or is still in preview/alpha stage (as of February 2026), making it an overly restrictive default that could cause CI failures or limit contributor adoption. Consider changing the default to a more stable version like 3.12 or 3.13, while still supporting 3.14 in the test matrix.
Suggested change:

    -default: "3.14"
    +default: "3.12"

pyproject.toml

        "slow: Slow tests (excluded from default runs)",
    ]
    addopts = ["--strict-markers", "--strict-config", "-ra", "--tb=short"]
    filterwarnings = ["error"]

The pytest configuration sets filterwarnings = ["error"] which converts all warnings into errors. This is extremely strict and could cause test failures from third-party dependencies that emit deprecation warnings or other warnings. Consider using a more targeted approach like error::DeprecationWarning:ai_company.* to only treat warnings from the project's own code as errors, or at minimum add exceptions for known third-party warnings.
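The effect of a module-scoped filter can be sketched with the standard `warnings` module, under the assumption that the project package is named `ai_company` as in the comment above. The helpers `emit` and `demo` and the module name `third_party_lib` are hypothetical. The ordering matters: Python consults the most recently added filter first, and pytest documents that the last matching `filterwarnings` entry wins, so a catch-all such as `default` belongs before the more specific entry.

```python
import warnings


def emit(module: str) -> str:
    """Issue a DeprecationWarning attributed to `module` and report what happened."""
    try:
        warnings.warn_explicit(
            "old API", DeprecationWarning, f"{module}.py", 1, module=module
        )
    except DeprecationWarning:
        return "error"
    return "warning"


def demo() -> dict[str, str]:
    # record=True captures non-error warnings instead of printing them;
    # the previous filter state is restored when the block exits.
    with warnings.catch_warnings(record=True):
        warnings.resetwarnings()
        # General rule first, specific rule second: the later filter is checked first.
        warnings.simplefilter("default")
        warnings.filterwarnings(
            "error", category=DeprecationWarning, module=r"ai_company(\.|$)"
        )
        return {name: emit(name) for name in ("ai_company.core", "third_party_lib")}


print(demo())
```

Only the warning attributed to the project's own package is raised as an exception; the third-party one is still reported as an ordinary warning.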
Suggested change:

    -filterwarnings = ["error"]
    +filterwarnings = [
    +    # pytest applies the last matching entry, so the catch-all comes first
    +    "default",
    +    "error::DeprecationWarning:ai_company.*",
    +]

Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/actions/setup-python-uv/action.yml:
- Around line 20-24: The workflow installs a specific Python via the `uv python
install` step but then runs `uv sync --frozen` without specifying the
interpreter; update the `uv sync` invocation to include the `--python` flag and
pass the same version installed (mirror the value from `uv python install`) so
`uv sync --frozen --python <installed-version>` is used; target the `uv sync`
command in the action.yml to ensure the installed interpreter is pinned.
In @.github/workflows/dependabot-auto-merge.yml:
- Line 3: Update the workflow to grant the GITHUB_TOKEN write permissions so
Dependabot-triggered runs can perform merges: add a top-level permissions block
(e.g., permissions: pull-requests: write and contents: write) to the workflow
that uses on: pull_request; do not switch to pull_request_target. Optionally,
add the dependabot/fetch-metadata action in the job to filter Dependabot events
by update type (patch/minor) before running gh pr merge.
In `@pyproject.toml`:
- Around line 29-36: Replace the range specifiers in the test dependency list in
pyproject.toml with exact version pins: change "pytest>=8.4.1",
"pytest-asyncio>=1.0.0", "pytest-cov>=6.2.1", "pytest-mock>=3.14.0",
"pytest-timeout>=2.4.0", "pytest-xdist>=3.6.1", "polyfactory>=2.21.0", and
"respx>=0.22.0" to pinned forms (e.g., "pytest==8.4.1", "pytest-asyncio==1.0.0",
etc.); apply the same exact-version pinning to the equivalent entries in the dev
group as well so both test and dev dependency groups are deterministic.
In `@tests/unit/test_smoke.py`:
- Around line 1-32: Relocate the test module that defines
test_package_importable, test_version_format, and test_markers_registered out of
the unit-test area into the project's smoke-test folder so it is treated as a
smoke suite rather than a unit suite; ensure the file name remains test_smoke.py
and update any import/CI/test-discovery references accordingly so pytest
discovers it under the smoke tests grouping.
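For the `pyproject.toml` item above, a rough sketch of turning lower-bound specifiers into exact pins. The helper `pin_exact` is hypothetical and only handles bare `name>=version` strings. Pinning to the lower bound is itself an assumption, since the version resolved in `uv.lock` may be newer.

```python
import re

# Matches a bare "name>=version" requirement; extras and markers are out of scope here.
_LOWER_BOUND = re.compile(r"^(?P<name>[A-Za-z0-9._-]+)>=(?P<version>[A-Za-z0-9.]+)$")


def pin_exact(requirement: str) -> str:
    """Rewrite "name>=X" as "name==X"; return anything else unchanged."""
    match = _LOWER_BOUND.match(requirement)
    if match is None:
        return requirement
    return f"{match['name']}=={match['version']}"


for spec in ["pytest>=8.4.1", "pytest-asyncio>=1.0.0", "respx>=0.22.0"]:
    print(pin_exact(spec))
```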
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
- `uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (12)
- `.github/actions/setup-python-uv/action.yml`
- `.github/dependabot.yml`
- `.github/workflows/ci.yml`
- `.github/workflows/dependabot-auto-merge.yml`
- `.github/workflows/dependency-review.yml`
- `README.md`
- `pyproject.toml`
- `tests/conftest.py`
- `tests/e2e/conftest.py`
- `tests/integration/conftest.py`
- `tests/unit/conftest.py`
- `tests/unit/test_smoke.py`
📜 Review details
🧰 Additional context used
🧠 Learnings (27)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Run `pytest` to verify tests pass before committing
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Before each commit, run `ruff format .` to format code, `ruff check .` to lint code (use `ruff check --fix .` to auto-fix), and `pytest` to run tests
Applied to files:
.github/workflows/ci.yml, pyproject.toml
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/tests/conftest.py : Place shared pytest fixtures in `tests/conftest.py`
Applied to files:
tests/unit/conftest.py, tests/e2e/conftest.py, tests/conftest.py, tests/integration/conftest.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/unit/test_*.py : Write unit tests for new functionality using pytest. Place test files in `tests/unit/` with `test_*.py` naming convention.
Applied to files:
tests/unit/conftest.py, tests/e2e/conftest.py, tests/conftest.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/**/*.py : Use pytest fixtures for test setup. Shared fixtures should be in `tests/conftest.py`
Applied to files:
tests/unit/conftest.py, tests/e2e/conftest.py, tests/conftest.py, tests/integration/conftest.py
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to tests/unit/test_*.py : Write unit tests for new functionality using pytest in `tests/unit/` with `test_*.py` naming convention
Applied to files:
tests/unit/conftest.py, tests/e2e/conftest.py, tests/conftest.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Each test should be independent and not rely on other tests; use pytest fixtures for test setup (shared fixtures in `tests/conftest.py`); clean up resources in teardown/fixtures
Applied to files:
tests/unit/conftest.py, tests/unit/test_smoke.py, tests/e2e/conftest.py, tests/conftest.py, tests/integration/conftest.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/test_*.py : Use appropriate fixture scopes (`function`, `class`, `module`, `session`) and document complex fixtures with docstrings
Applied to files:
tests/unit/conftest.py, tests/e2e/conftest.py, tests/conftest.py, tests/integration/conftest.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/smoke/test_*.py : Place smoke tests (quick startup validation tests) in `tests/smoke/` directory
Applied to files:
tests/unit/test_smoke.py
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Run `pytest` to verify tests pass before committing
Applied to files:
tests/unit/test_smoke.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/e2e/test_*.py : Place end-to-end browser tests in `tests/e2e/` directory (require playwright)
Applied to files:
tests/e2e/conftest.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/tests/{unit,integration,smoke,e2e}/test_*.py : Test files should mirror the structure of the code they test: `tests/unit/test_<module>.py` for unit tests, `tests/integration/test_<feature>.py` for integration tests, `tests/smoke/test_<component>.py` for smoke tests, `tests/e2e/test_<workflow>.py` for end-to-end tests
Applied to files:
tests/e2e/conftest.py
📚 Learning: 2026-01-24T09:54:56.100Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/instructions/test-files.instructions.md:0-0
Timestamp: 2026-01-24T09:54:56.100Z
Learning: Applies to **/test_*.py : Use `async def` for async test functions; pytest-asyncio is configured with `asyncio_mode = "auto"`; mark async tests with `pytest.mark.asyncio` if needed
Applied to files:
tests/conftest.py, tests/integration/conftest.py, pyproject.toml
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Applies to README.md : Update README.md for significant feature changes
Applied to files:
README.md
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: When making changes that affect architecture, services, key files, settings, or workflows, update the relevant sections of existing documentation (CLAUDE.md, README.md, etc.) to reflect those changes.
Applied to files:
README.md
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to tests/integration/test_*.py : Place integration tests for component interactions in `tests/integration/` directory
Applied to files:
tests/integration/conftest.py
📚 Learning: 2026-01-24T16:33:29.354Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-24T16:33:29.354Z
Learning: Never run full test suite scans - only run tests for specific files when needed (e.g., `pytest tests/unit/test_settings.py`). Full test runs take too long and should only be done by CI.
Applied to files:
tests/integration/conftest.py
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to pyproject.toml : Minimize external dependencies. When adding new dependencies, add them to `pyproject.toml` with exact version pins.
Applied to files:
pyproject.toml
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : NEVER add `from __future__ import annotations` - Python 3.14+ has native support for deferred annotation evaluation and does not require this import
Applied to files:
pyproject.toml
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : After making code changes, run `ruff format .` to format code with Ruff
Applied to files:
pyproject.toml
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : After making code changes, run `ruff check .` to lint code (use `ruff check --fix .` to auto-fix)
Applied to files:
pyproject.toml
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Line length: 100 characters (enforced by Ruff)
Applied to files:
pyproject.toml
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to **/*.py : Never dismiss warnings from `ruff check`, `pytest`, or other tools. Investigate and fix them, even if they appear unrelated to the current task.
Applied to files:
pyproject.toml
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Applies to **/*.py : Maintain line length at 100 characters per Ruff formatting rules.
Applied to files:
pyproject.toml
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : Python version: 3.14+
Applied to files:
pyproject.toml
📚 Learning: 2026-01-26T08:59:32.818Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-26T08:59:32.818Z
Learning: Applies to **/*.py : After making code changes, run `pytest` to run tests and verify nothing broke
Applied to files:
pyproject.toml
📚 Learning: 2026-02-26T17:43:50.902Z
Learnt from: CR
Repo: Aureliolo/story-factory PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-26T17:43:50.902Z
Learning: Never run full test suite scans during development—only run tests for specific files when needed (e.g., `pytest tests/unit/test_settings.py`).
Applied to files:
pyproject.toml
🔇 Additional comments (8)
.github/dependabot.yml (1)
7-8: Dependabot cadence update looks good. Weekly Monday scheduling for both ecosystems is consistent and reduces PR churn versus daily updates.
Also applies to: 22-23

.github/workflows/dependency-review.yml (1)
1-20: Solid dependency review baseline. Workflow trigger, minimal permissions, and `actions/dependency-review-action@v4` with severity gating are well configured.

tests/conftest.py (1)
6-9: Good shared async backend fixture setup. `anyio_backend` in root `tests/conftest.py` is the right location and keeps async test backend selection centralized.

tests/e2e/conftest.py (1)
1-1: E2E conftest scaffold is fine. This provides clean structure for suite-specific e2e fixtures as tests are added.

tests/unit/conftest.py (1)
1-1: Unit conftest scaffold looks good. No issues here; this is a clean placeholder for unit-only fixtures.

tests/integration/conftest.py (1)
8-12: CI-gated integration skip logic is clear and practical. The opt-in pattern with `RUN_INTEGRATION_TESTS` is explicit and keeps CI behavior predictable.

README.md (1)
3-3: The CI badge repository slug is correct. The hard-coded slug `Aureliolo/ai-company` in line 3 matches the current repository.

.github/workflows/ci.yml (1)
19-20: The composite action `.github/actions/setup-python-uv/action.yml` already has `python-version` defaulting to `"3.14"`, so lint/type-check jobs (lines 19-20, 32-33) are already deterministic and consistent with the test job. No changes needed.
Likely an incorrect or invalid review comment.

.github/workflows/dependabot-auto-merge.yml

    @@ -0,0 +1,25 @@
    name: Dependabot Auto-Merge

    on: pull_request

🧩 Analysis chain
🌐 Web query:
GitHub Actions GITHUB_TOKEN permissions pull_request vs pull_request_target Dependabot
💡 Result:
pull_request vs pull_request_target (what changes for GITHUB_TOKEN + secrets)
- `pull_request`
  - Runs in the context of the PR’s code/merge commit, and is treated as untrusted when the actor is external (including Dependabot).
  - For Dependabot-triggered runs, GitHub treats it like a fork: `GITHUB_TOKEN` is read-only and repo Actions secrets are not available (only Dependabot secrets are). [1][2]
- `pull_request_target`
  - Runs in the context of the base branch (trusted context), and is intended for actions like labeling/commenting without executing PR code. GitHub explicitly warns not to use it to build/run untrusted PR code. [3][4]
  - Important Dependabot nuance: even on `pull_request_target`, if the PR base ref was created by Dependabot, `GITHUB_TOKEN` is still read-only and secrets are not available. [1]

Practical guidance (especially with Dependabot)
- Use `pull_request` for CI/tests that execute PR code (safe default).
- Use `pull_request_target` only for privileged, non-code-executing tasks (triage, labels, comments), and keep token scopes minimal via `permissions:`. [4][5]
- If you need a privileged follow-up step (write token / access to Actions secrets) after an untrusted PR workflow, use a split design (e.g., `workflow_run`) rather than running PR code under `pull_request_target`. (GitHub recommends a two-step approach for Dependabot-related limitations.) [6]
Setting / tightening GITHUB_TOKEN permissions
You can explicitly set minimal permissions at workflow/job level with `permissions:` (recommended least-privilege). [5]

Example (safe PR CI):

    on: pull_request
    permissions:
      contents: read

Example (PR triage only):

    on: pull_request_target
    permissions:
      pull-requests: write
      contents: read

Key takeaway for Dependabot: if your workflow fails because it tries to write (comment, label, push tags, etc.) on a Dependabot PR, it’s usually because the run received a read-only `GITHUB_TOKEN` and no Actions secrets under these events, by design. [1][2]
Sources: GitHub Docs / Changelog on Dependabot restrictions and token/secrets behavior [1][2], GitHub docs on `pull_request_target` context and warnings [3], GitHub security guidance on `pull_request_target` risks [4], GitHub docs on configuring `GITHUB_TOKEN` permissions [5], and GitHub troubleshooting guidance for Dependabot + Actions patterns [6].
🌐 Web query:
GitHub Dependabot auto-merge pull_request_target recommended approach
💡 Result:
Recommended approach (avoid pull_request_target for auto-merge)
Use `pull_request` + GitHub’s native auto-merge, and gate it tightly to Dependabot + allowed update types (patch/minor), using Dependabot metadata. GitHub’s own Dependabot automation guide shows this pattern (`on: pull_request`, explicit `permissions`, `dependabot/fetch-metadata`, then `gh pr merge --auto ...`).[1]
Minimal sketch (adapt the conditions to your policy):

    name: Dependabot auto-merge
    on: pull_request
    permissions:
      contents: write
      pull-requests: write
    jobs:
      dependabot-automerge:
        if: github.event.pull_request.user.login == 'dependabot[bot]'
        runs-on: ubuntu-latest
        steps:
          - uses: dependabot/fetch-metadata@<pinned-sha>
            id: metadata
            with:
              github-token: ${{ secrets.GITHUB_TOKEN }}
          - name: Enable auto-merge
            if: steps.metadata.outputs.update-type == 'version-update:semver-patch'
            run: gh pr merge --auto --merge "$PR_URL"
            env:
              PR_URL: ${{ github.event.pull_request.html_url }}
              GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Also enable branch protection required status checks so nothing merges without CI passing.[1]
Why pull_request_target is generally not recommended here
- Security: `pull_request_target` is a privileged trigger; using it incorrectly (especially if you checkout/run PR code) can lead to “pwn request” secret/token exfiltration.[2][3] GitHub’s hardening guidance explicitly advises avoiding `pull_request_target` unless you truly need that privileged context, and to never combine it with untrusted checkout/execution.[3]
- It doesn’t solve the Dependabot secret/token limitation the way people expect: GitHub documents that for Dependabot-triggered workflows, secrets aren’t available, and even with `pull_request_target` (for Dependabot-authored PRs) the token is read-only and secrets are not available.[4] (This is part of GitHub’s broader change to treat Dependabot-triggered runs like fork PRs by default.)[5]

When pull_request_target is appropriate (if you insist)
Only use it for non-checkout, non-execution operations on trusted PR authors (e.g., labeling/commenting), and keep the workflow minimal: no `actions/checkout`, no running PR-provided code.[2][3]
If you need a privileged “merge” step but want better separation, GitHub recommends privilege separation patterns (often via `workflow_run`) instead of relying on `pull_request_target`.[3]
Sources: GitHub Docs Dependabot automation example[1], GitHub Security Lab pwn-request guidance[2], GitHub Actions hardening guidance[3], GitHub Docs Dependabot-on-Actions restrictions[4], GitHub changelog on Dependabot read-only permissions[5].
[1] (docs.github.com)
[2] (securitylab.github.com)
[3] (docs.github.com)
[4] (docs.github.com)
[5] (github.blog)
Citations:
- 1: https://docs.github.com/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/automating-dependabot-with-github-actions?learn=dependency_version_updates
- 2: https://securitylab.github.com/resources/github-actions-preventing-pwn-requests/?utm_source=openai
- 3: https://docs.github.com/github-ae%40latest/actions/security-guides/security-hardening-for-github-actions?utm_source=openai
- 4: https://docs.github.com/en/code-security/reference/supply-chain-security/dependabot-on-actions
- 5: https://github.blog/changelog/2021-02-19-github-actions-workflows-triggered-by-dependabot-prs-will-run-with-read-only-permissions/?utm_source=openai
🌐 Web query:
pull_request_target security implications GitHub Actions Dependabot
💡 Result:
pull_request_target: core security implication
- pull_request_target runs the workflow in the context of the base repository (not the PR branch). That means it can have access to repository secrets and a write-capable GITHUB_TOKEN, unless GitHub applies additional restrictions (see Dependabot notes below).
- The main risk (“pwn request”) happens when a pull_request_target workflow checks out and executes PR-controlled code (e.g., actions/checkout of the PR’s head.sha, running scripts from the PR). An attacker can then exfiltrate secrets or use the token to modify the repo. GitHub explicitly warns against checking out untrusted PR code with privileged triggers like pull_request_target (and workflow_run). [2], [3], [4]
How this interacts with Dependabot PRs
GitHub intentionally restricts permissions/secrets for workflows triggered by Dependabot to reduce blast radius:
- For Dependabot-triggered workflows on events like pull_request/push: GITHUB_TOKEN is read-only by default and repo Actions secrets are not available (Dependabot uses its own “Dependabot secrets” store for eligible cases). [1], [5]
- For pull_request_target specifically: if the PR is from Dependabot (and GitHub’s condition matches, e.g., base ref created by Dependabot), the workflow will not receive secrets and the token will be read-only, even though pull_request_target is normally privileged. This was enforced via platform changes to prevent privileged execution from Dependabot-triggered contexts. [1], [6]
Practical takeaways / safer patterns
- Avoid pull_request_target unless you truly need base-repo privileges (e.g., labeling/commenting). [3], [4]
- If you must use it, do not check out or run PR code in that job; treat PR content as untrusted input. [2], [3], [4]
- Prefer least-privilege permissions: (explicitly set only what you need) and consider splitting into:
  - an unprivileged pull_request workflow for building/testing PR code, and
  - a privileged follow-up via workflow_run (carefully handling artifacts) for actions that need secrets/write access. [2], [3]
- Real-world failures do occur: publicly disclosed incidents show pull_request_target + checkout/execute PR code can lead to secret exfiltration. [7]
Sources: GitHub Docs on Dependabot + Actions restrictions [1], GitHub hardening guidance for Actions [2], GitHub Well-Architected Actions security guidance [4], GitHub changelog notes on Dependabot permission restrictions [5], [6], and a concrete pull_request_target exfiltration advisory [7].
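The risky combination described above (a pull_request_target trigger together with a checkout of PR-controlled code) is simple enough to flag with a rough text scan. The following is a minimal sketch under stated assumptions: `find_pwn_request_risks` is a hypothetical helper, it uses plain regular expressions rather than a YAML parser, and it will therefore also match commented-out lines. Treat it as a review aid, not as a substitute for a real workflow linter.

```python
import re

# Heuristic patterns; a text scan, not a YAML parse.
TARGET_TRIGGER = re.compile(r"\bpull_request_target\b")
CHECKOUT_STEP = re.compile(r"uses:\s*actions/checkout@")
PR_HEAD_REF = re.compile(r"github\.event\.pull_request\.head\.(sha|ref)")


def find_pwn_request_risks(workflow_text: str) -> list[str]:
    """Return human-readable findings for one workflow file's text."""
    findings: list[str] = []
    if not TARGET_TRIGGER.search(workflow_text):
        # Without the privileged trigger, this particular pattern does not apply.
        return findings
    if CHECKOUT_STEP.search(workflow_text):
        findings.append("pull_request_target combined with actions/checkout")
    if PR_HEAD_REF.search(workflow_text):
        findings.append("pull_request_target references the PR head ref/sha")
    return findings


RISKY_EXAMPLE = """\
on: pull_request_target
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
"""

SAFE_EXAMPLE = """\
on: pull_request
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
"""

if __name__ == "__main__":
    for finding in find_pwn_request_risks(RISKY_EXAMPLE):
        print("risk:", finding)
```

A workflow that only labels or comments under pull_request_target produces no findings here, which matches the guidance that the trigger itself is acceptable for non-checkout, non-execution operations.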
Add explicit permissions to this workflow to enable Dependabot auto-merge; do not switch to pull_request_target.
The pull_request event itself is correct for auto-merge. The actual issue is that Dependabot-triggered workflows on pull_request receive a read-only GITHUB_TOKEN by design. This causes gh pr merge to fail silently.
GitHub explicitly discourages using pull_request_target for Dependabot auto-merge (it still restricts Dependabot-triggered runs to read-only, and introduces unnecessary security risks). Instead, explicitly grant write permissions at the workflow level.
🔧 Proposed fix
 on: pull_request
+permissions:
+  contents: write
+  pull-requests: write
+
 jobs:
   auto-merge:
     runs-on: ubuntu-latest
     if: github.actor == 'dependabot[bot]'
For additional safety, consider using the dependabot/fetch-metadata action to filter by update type (e.g., patch/minor only).
Also applies to: 22-25
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/dependabot-auto-merge.yml at line 3, Update the workflow
to grant the GITHUB_TOKEN write permissions so Dependabot-triggered runs can
perform merges: add a top-level permissions block (e.g., permissions:
pull-requests: write and contents: write) to the workflow that uses on:
pull_request; do not switch to pull_request_target. Optionally, add the
dependabot/fetch-metadata action in the job to filter Dependabot events by
update type (patch/minor) before running gh pr merge.
"""Smoke tests to verify project setup."""

import re

import pytest


@pytest.mark.unit
def test_package_importable() -> None:
    """Verify the ai_company package can be imported."""
    import ai_company

    assert hasattr(ai_company, "__version__")


@pytest.mark.unit
def test_version_format() -> None:
    """Verify version string follows semver format."""
    from ai_company import __version__

    pattern = r"^\d+\.\d+\.\d+([a-zA-Z0-9.+-]+)?$"
    assert re.match(pattern, __version__), f"Version {__version__!r} is not semver"


@pytest.mark.unit
def test_markers_registered(pytestconfig: pytest.Config) -> None:
    """Verify custom markers are registered (strict-markers won't fail)."""
    raw_markers: list[str] = pytestconfig.getini("markers")  # type: ignore[assignment]
    marker_names = {m.split(":")[0].strip() for m in raw_markers}
    expected = {"unit", "integration", "e2e", "slow"}
    missing = expected - marker_names
    assert expected.issubset(marker_names), f"Missing markers: {missing}"
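The pattern in test_version_format accepts any run of letters, digits, dots, plus signs, and hyphens after the three numeric parts, so strings that are not valid semver still pass. The sketch below contrasts that pattern (`LOOSE`) with a stricter one (`STRICT`) that requires a `-` before pre-release identifiers and a `+` before build metadata. `STRICT` is an illustrative assumption, simpler than the full expression published with the Semantic Versioning spec, and not necessarily the pattern this PR ended up with.

```python
import re

# Pattern as it appears in test_version_format.
LOOSE = re.compile(r"^\d+\.\d+\.\d+([a-zA-Z0-9.+-]+)?$")

# Hypothetical stricter pattern: optional "-prerelease" and "+build" parts,
# each a dot-separated list of non-empty identifiers.
STRICT = re.compile(
    r"^\d+\.\d+\.\d+"
    r"(?:-[0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*)?"
    r"(?:\+[0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*)?$"
)

if __name__ == "__main__":
    for candidate in ("0.1.0", "1.2.3-alpha.1+build.5", "1.2.3rc1", "1.2.3.."):
        loose_ok = LOOSE.match(candidate) is not None
        strict_ok = STRICT.match(candidate) is not None
        print(f"{candidate!r}: loose={loose_ok} strict={strict_ok}")
```

Note that a stricter pattern like this rejects PEP 440 style suffixes such as `1.2.3rc1`, so whether it is appropriate depends on which versioning scheme the package metadata actually follows.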
🧹 Nitpick | 🔵 Trivial
Move this smoke suite to tests/smoke/ to match test-organization conventions.
Line 1 labels these as smoke tests, but they are currently under tests/unit/. Relocating to tests/smoke/test_smoke.py keeps suite semantics and discovery conventions aligned.
Based on learnings: Applies to tests/smoke/test_*.py : Place smoke tests (quick startup validation tests) in tests/smoke/ directory.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/test_smoke.py` around lines 1 - 32, Relocate the test module that
defines test_package_importable, test_version_format, and
test_markers_registered out of the unit-test area into the project's smoke-test
folder so it is treated as a smoke suite rather than a unit suite; ensure the
file name remains test_smoke.py and update any import/CI/test-discovery
references accordingly so pytest discovers it under the smoke tests grouping.
- Revert dependabot schedule to daily (user preference)
- Enforce 80% coverage in CI with --cov-fail-under and --cov-report=term-missing
- Switch auto-merge workflow to pull_request_target for write token access
- Fix command injection in composite action (use env vars not direct interpolation)
- Pin Python version in uv sync --python for interpreter consistency
- Remove dead anyio_backend fixture (project uses pytest-asyncio)
- Update deprecated TCH rule prefix to TC for ruff compatibility
- Tighten version regex to enforce proper semver separators
- Remove unused request parameter from integration skip fixture
- Pin setup-uv action to v6.0.0 for reproducibility
- Improve docstrings for integration fixture and marker test
- Pin all dependency versions to exact matches (== instead of >=)
- Update lockfile to match pinned versions
🤖 I have created a release *beep* *boop* --- ## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08)) * add retry handler, rate limiter, and provider 
resilience ([#100](#100)) ([b890545](b890545)) * add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c)) * add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b)) * design unified provider interface ([#86](#86)) ([3e23d64](3e23d64)) * expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134)) * implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5)) * implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a)) * implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b)) * implement built-in file system tools ([#18](#18)) ([325ef98](325ef98)) * implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd)) * implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496)) * implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2)) * implement core entity and role system models ([#69](#69)) ([acf9801](acf9801)) * implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed)) * implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f)) * implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5)) * implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515)) * implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6)) * implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4) * implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe)) * implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca)) * 
implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2)) * implement model routing engine ([#99](#99)) ([d3c250b](d3c250b)) * implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3)) * implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c)) * implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85)) * implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4)) * implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e)) * implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30) * implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52)) * implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1)) * implement tool permission checking ([#16](#16)) ([833c190](833c190)) * implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba)) * implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba)) * initialize project with uv, hatchling, and src layout ([39005f9](39005f9)) * initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9)) * Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08)) * make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109) * parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee)) * testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749)) * wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617)) ### Bug Fixes * address Greptile post-merge review findings from PRs 
[#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929)) * address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169) * enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c)) * harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e)) * harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325)) * harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb)) * incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a)) * pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108)) * strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861)) ### Performance * harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188) ### Refactoring * adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90)) * extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b)) * harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9)) * harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299)) * pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0)) * split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89)) ### Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, 
contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, 
loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
🤖 I have created a release *beep* *boop* --- ## [0.1.0](v0.0.0...v0.1.0) (2026-03-11) ### Features * add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a)) * add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b)) * add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8)) * add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa)) * add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267) * add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480)) * add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173) * add design specification, license, and project setup ([8669a09](8669a09)) * add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832)) * add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139) * add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea)) * add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39)) * add mandatory JWT + API key authentication ([#256](#256)) ([c279cfe](c279cfe)) * add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa)) * add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4)) * add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4)) * add pluggable output scan response policies ([#263](#263)) ([b9907e8](b9907e8)) * add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779)) * add progressive 
trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08)) * add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545)) * add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c)) * add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b)) * design unified provider interface ([#86](#86)) ([3e23d64](3e23d64)) * expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134)) * implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5)) * implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a)) * implement AuditRepository for security audit log persistence ([#279](#279)) ([94bc29f](94bc29f)) * implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b)) * implement built-in file system tools ([#18](#18)) ([325ef98](325ef98)) * implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd)) * implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496)) * implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2)) * implement core entity and role system models ([#69](#69)) ([acf9801](acf9801)) * implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed)) * implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f)) * implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5)) * implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515)) * implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6)) * implement LiteLLM driver and provider registry 
([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4) * implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe)) * implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca)) * implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2)) * implement model routing engine ([#99](#99)) ([d3c250b](d3c250b)) * implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3)) * implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c)) * implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85)) * implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4)) * implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e)) * implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30) * implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52)) * implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1)) * implement tool permission checking ([#16](#16)) ([833c190](833c190)) * implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba)) * implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba)) * initialize project with uv, hatchling, and src layout ([39005f9](39005f9)) * initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9)) * Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08)) * make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109) * parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee)) * testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749)) * wire all modules into 
observability system ([#97](#97)) ([f7a0617](f7a0617)) ### Bug Fixes * address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929)) * address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169) * enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c)) * harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e)) * harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325)) * harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb)) * incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a)) * pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108)) * resolve circular imports, bump litellm, fix release tag format ([#286](#286)) ([a6659b5](a6659b5)) * strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861)) ### Performance * harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188) ### Refactoring * adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90)) * extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b)) * harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9)) * harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299)) * pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0)) * split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89)) ### 
Documentation * add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39) * add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b)) * add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54) * add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595)) * address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a)) * expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6)) * finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742)) * update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee)) ### Tests * add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4)) * add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4)) ### CI/CD * add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758)) * bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247)) * bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517)) * bump anchore/scan-action from 6.5.1 to 7.3.2 ([#271](#271)) ([80a1c15](80a1c15)) * bump docker/build-push-action from 6.19.2 to 7.0.0 ([#273](#273)) ([dd0219e](dd0219e)) * bump docker/login-action from 3.7.0 to 4.0.0 ([#272](#272)) ([33d6238](33d6238)) * bump docker/metadata-action from 5.10.0 to 6.0.0 ([#270](#270)) ([baee04e](baee04e)) * bump docker/setup-buildx-action from 3.12.0 to 4.0.0 ([#274](#274)) ([5fc06f7](5fc06f7)) * bump sigstore/cosign-installer from 3.9.1 to 4.1.0 ([#275](#275)) ([29dd16c](29dd16c)) * harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c)) * split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af)) ### Maintenance * add /worktree skill for parallel worktree 
management ([#171](#171)) ([951e337](951e337)) * add design spec context loading to research-link skill ([8ef9685](8ef9685)) * add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705)) * add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023)) * add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577)) * bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86)) * bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c)) * bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46)) * fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5)) * **main:** release ai-company 0.1.1 ([#282](#282)) ([2f4703d](2f4703d)) * pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002)) * post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>
Summary
- warn_unreachable, show_error_codes, enable_error_code) and pydantic-mypy plugin config
- from __future__ import annotations via ruff banned-api (unnecessary on Python 3.14+, PEP 649)
New Files (9)
- .github/workflows/ci.yml
- .github/workflows/dependabot-auto-merge.yml
- .github/workflows/dependency-review.yml
- .github/actions/setup-python-uv/action.yml
- tests/conftest.py
- tests/unit/conftest.py
- tests/integration/conftest.py
- tests/e2e/conftest.py
- tests/unit/test_smoke.py
Modified Files (3)
- pyproject.toml
- .github/dependabot.yml
- README.md
Test plan
- uv run mypy src/ passes with new strict flags
- uv run pytest tests/ -v: 3 smoke tests pass
- uv run pytest tests/ --cov=ai_company --cov-report=term-missing: 100% coverage
- uv run ruff check src/ tests/: no lint errors
- uv run pre-commit run --all-files: all hooks pass
Closes #25, #35, #51, #52