
feat: add configurable cost tiers and subscription/quota-aware tracking (#67) #185

Merged
Aureliolo merged 3 commits into main from feat/cost-tiers
Mar 9, 2026

Conversation

@Aureliolo
Owner

Summary

  • Configurable cost tiers (budget/cost_tiers.py): CostTierDefinition model with price ranges, display properties, and classify_model_tier() function. Built-in 4-tier defaults (low/medium/high/premium) with user override/merge via resolve_tiers()
  • Quota & subscription models (budget/quota.py): SubscriptionConfig, QuotaLimit, QuotaWindow, ProviderCostModel, DegradationConfig, QuotaSnapshot, QuotaCheckResult — frozen Pydantic models for provider subscription/quota configuration
  • Quota tracking service (budget/quota_tracker.py): QuotaTracker with per-provider, per-window (minute/hour/day/month) request and token tracking, automatic window rotation, async-safe via asyncio.Lock
  • BudgetEnforcer integration: Pre-flight check_can_execute() now includes provider quota checks; check_quota() public API; QuotaExhaustedError raised on exhaustion
  • Config schema: ProviderConfig gains subscription + degradation fields; RootConfig gains cost_tiers field
  • Documentation: Updated DESIGN_SPEC.md (§1.4, §9.2, §10.4, §15.3, §15.5), CLAUDE.md, README.md
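
The tier merge and classification described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code in budget/cost_tiers.py: it uses stdlib dataclasses instead of the PR's Pydantic models, and the field names, the price boundaries, and the choice to raise on a negative cost are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CostTierDefinition:
    """Hypothetical stand-in for the PR's model; field names are assumed."""

    id: str
    display_name: str
    price_range_min: float  # cost per 1k tokens, inclusive lower bound
    price_range_max: Optional[float]  # exclusive upper bound; None = unbounded


# Assumed boundaries; the real built-in values are not shown in this thread.
BUILTIN_TIERS = (
    CostTierDefinition("low", "Low", 0.0, 0.002),
    CostTierDefinition("medium", "Medium", 0.002, 0.01),
    CostTierDefinition("high", "High", 0.01, 0.03),
    CostTierDefinition("premium", "Premium", 0.03, None),
)


def resolve_tiers(user_tiers=(), include_builtin=True):
    """Merge user tiers over the built-ins; a user tier with the same id wins."""
    merged = {t.id: t for t in BUILTIN_TIERS} if include_builtin else {}
    merged.update({t.id: t for t in user_tiers})
    return tuple(sorted(merged.values(), key=lambda t: t.price_range_min))


def classify_model_tier(cost_per_1k, tiers):
    """Return the id of the first tier whose range contains the cost, else None."""
    if cost_per_1k < 0:
        raise ValueError("cost_per_1k must be non-negative")
    for tier in tiers:
        upper = tier.price_range_max
        if tier.price_range_min <= cost_per_1k and (upper is None or cost_per_1k < upper):
            return tier.id
    return None  # a classification miss, e.g. a gap left by a user override
```

Overriding a built-in tier with a narrower range can leave a gap between tiers, in which case the sketch returns None rather than guessing a tier.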

Closes #67

Test plan

  • 14 new tests added for review findings (exhaustion reason with estimated tokens, deep copy isolation, multi-window exhaustion, input validation, cross-field validation, graceful degradation)
  • All 4593 tests pass (6 skipped — symlinks/real LLM)
  • 96.35% coverage (80% minimum)
  • mypy strict: 0 errors
  • ruff: all checks passed
  • Pre-commit hooks: all passed

Review coverage

Pre-reviewed by 9 agents (code-reviewer, python-reviewer, pr-test-analyzer, silent-failure-hunter, comment-analyzer, type-design-analyzer, logging-audit, resilience-audit, docs-consistency). 34 findings addressed, 0 skipped.

…ng (#67)

Add two sub-systems for BudgetEnforcer:
- Cost tier definitions with configurable metadata, price ranges, and
  model classification (CostTierDefinition, resolve_tiers, classify_model_tier)
- Subscription/quota tracking with per-provider rate/token/request caps,
  three cost models (per_token, subscription, local), and graceful
  degradation strategies (QuotaTracker, SubscriptionConfig, DegradationConfig)

Integrate quota checks into BudgetEnforcer.check_can_execute() and add
cost_tiers to RootConfig, subscription/degradation to ProviderConfig.
Pre-reviewed by 9 agents, 34 findings addressed:
- Fix _build_exhaustion_reason ignoring estimated_tokens (bug)
- Re-export QuotaExhaustedError from engine/__init__.py
- Fix misleading docstrings about degradation strategy
- Fix _patch_periods() creating duplicate context managers
- Add input validation for negative requests/tokens
- Add DEBUG logging for unknown providers in QuotaTracker
- Compute window_resets_at in QuotaSnapshot
- Add QuotaCheckResult cross-field validation
- Add negative cost guard in classify_model_tier
- Update DESIGN_SPEC.md, CLAUDE.md, README.md docs
- Add 14 new tests for coverage gaps
- Code simplification via single-pass validators
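
As an illustration of the "QuotaCheckResult cross-field validation" item above, one plausible rule is that an allowed result carries no denial details and a denied result always carries a reason. The sketch below uses a stdlib dataclass instead of the PR's Pydantic model; the field names follow the review diagrams in this thread, but the exact rule is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotaCheckResult:
    """Hypothetical stand-in; the real model is a frozen Pydantic model."""

    allowed: bool
    provider_name: str
    reason: str = ""
    exhausted_windows: tuple = ()

    def __post_init__(self):
        # Assumed rule: the two outcomes must not mix their fields.
        if self.allowed and (self.reason or self.exhausted_windows):
            raise ValueError("allowed results must not carry denial details")
        if not self.allowed and not self.reason:
            raise ValueError("denied results must include a reason")
```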
Copilot AI review requested due to automatic review settings March 9, 2026 11:44
@github-actions
Contributor

github-actions bot commented Mar 9, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@coderabbitai

coderabbitai bot commented Mar 9, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f1d5a084-07cd-47ca-ae07-dceeff5c6c11

📥 Commits

Reviewing files that changed from the base of the PR and between 02baced and 05fbbc8.

📒 Files selected for processing (10)
  • DESIGN_SPEC.md
  • src/ai_company/budget/cost_tiers.py
  • src/ai_company/budget/enforcer.py
  • src/ai_company/budget/quota.py
  • src/ai_company/budget/quota_tracker.py
  • src/ai_company/config/schema.py
  • tests/unit/budget/test_cost_tiers.py
  • tests/unit/budget/test_enforcer_quota.py
  • tests/unit/budget/test_quota.py
  • tests/unit/budget/test_quota_tracker.py

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Configurable cost tiers and model-tier classification.
    • Per-provider quota tracking across minute/hour/day/month windows with pre-flight checks.
    • Provider subscription and degradation strategies (fallback, queue, alert) added to configuration.
    • Budget enforcement now includes quota checks and a new quota-exhausted error.
  • Observability

    • New quota- and budget-related event identifiers for monitoring.
  • Tests

    • Extensive unit tests added for cost tiers, quota models, quota tracker, and enforcer integration.

Walkthrough

Adds configurable cost tiers, provider subscription/quota models, a windowed QuotaTracker service, quota-aware BudgetEnforcer integration (with QuotaExhaustedError), observability events for quota/budget, config/schema extensions, default config key, and comprehensive unit tests for the new budgeting surface.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Cost Tier System: src/ai_company/budget/cost_tiers.py, src/ai_company/observability/events/budget.py | New CostTierDefinition and CostTiersConfig models, BUILTIN_TIERS, resolve/classify functions, and two budget event constants for tier resolution/classify misses. |
| Quota Domain Models: src/ai_company/budget/quota.py | New quota/subscription models and enums: QuotaWindow, QuotaLimit, ProviderCostModel, SubscriptionConfig, DegradationConfig/DegradationAction, QuotaSnapshot, QuotaCheckResult, plus utilities window_start and effective_cost_per_1k. |
| Quota Tracking Service: src/ai_company/budget/quota_tracker.py | New QuotaTracker implementing windowed per-provider usage, record_usage, check_quota, snapshot APIs, window rotation, concurrency protection, and quota-related observability events. |
| Budget Enforcer Integration: src/ai_company/budget/enforcer.py, src/ai_company/engine/errors.py | BudgetEnforcer accepts optional quota_tracker, check_can_execute accepts provider_name/estimated_tokens, adds check_quota and provider quota check path; introduces QuotaExhaustedError. |
| Public API Exports: src/ai_company/budget/__init__.py, src/ai_company/engine/__init__.py | Re-exports new cost/quota types and QuotaTracker; exposes QuotaExhaustedError on engine public API. |
| Configuration & Defaults: src/ai_company/config/schema.py, src/ai_company/config/defaults.py | Adds ProviderConfig.subscription and .degradation fields; adds RootConfig.cost_tiers; default config gains cost_tiers key. |
| Observability Events: src/ai_company/observability/events/quota.py | New quota event constants (QUOTA_TRACKER_CREATED, QUOTA_USAGE_RECORDED, QUOTA_CHECK_ALLOWED, QUOTA_CHECK_DENIED, QUOTA_WINDOW_ROTATED, QUOTA_SNAPSHOT_QUERIED, QUOTA_USAGE_SKIPPED). |
| Documentation / Design: CLAUDE.md, DESIGN_SPEC.md, README.md | Updated docs/design snapshots and README to reflect configurable cost tiers and quota/subscription tracking in budget enforcement. |
| Tests & Factories: tests/unit/budget/conftest.py, tests/unit/config/conftest.py, tests/unit/observability/test_events.py, tests/unit/budget/test_cost_tiers.py, tests/unit/budget/test_quota.py, tests/unit/budget/test_quota_tracker.py, tests/unit/budget/test_enforcer_quota.py | Adds factories for new models, make_quota_tracker helper, extends config factories, updates event discovery test, and adds extensive unit tests covering tiers, quota models, tracker behavior, and BudgetEnforcer quota integration. |

Sequence Diagram(s)

sequenceDiagram
    participant Agent
    participant BudgetEnforcer as BudgetEnforcer
    participant QuotaTracker
    participant Observability as Observability

    Agent->>BudgetEnforcer: check_can_execute(agent_id, provider_name, estimated_tokens)
    alt quota_tracker configured & provider_name provided
        BudgetEnforcer->>QuotaTracker: check_quota(provider_name, estimated_tokens)
        QuotaTracker->>QuotaTracker: evaluate windows (per_minute/hour/day/month)
        alt allowed
            QuotaTracker->>Observability: emit QUOTA_CHECK_ALLOWED
            QuotaTracker-->>BudgetEnforcer: QuotaCheckResult(allowed=true)
            BudgetEnforcer-->>Agent: proceed
        else denied
            QuotaTracker->>Observability: emit QUOTA_CHECK_DENIED
            QuotaTracker-->>BudgetEnforcer: QuotaCheckResult(allowed=false, reason)
            BudgetEnforcer->>Observability: log QUOTA_CHECK_DENIED
            BudgetEnforcer-->>Agent: raise QuotaExhaustedError
        end
    else no quota_tracker or no provider_name
        BudgetEnforcer->>Observability: debug skip quota check
        BudgetEnforcer-->>Agent: proceed
    end
sequenceDiagram
    participant Engine
    participant QuotaTracker
    participant Provider
    participant Monitor

    Engine->>QuotaTracker: check_quota(provider, estimated_tokens)
    QuotaTracker->>QuotaTracker: get current snapshots
    alt within quota
        QuotaTracker-->>Engine: allowed=true
        Engine->>Provider: perform request
        Provider-->>Engine: response
        Engine->>QuotaTracker: record_usage(provider, requests=1, tokens=N)
        QuotaTracker->>Observability: emit QUOTA_USAGE_RECORDED
    else exhausted
        QuotaTracker-->>Engine: allowed=false (reason)
        Engine->>Observability: log quota denial
    end

    Monitor->>QuotaTracker: time boundary crossed
    QuotaTracker->>QuotaTracker: rotate window counters
    QuotaTracker->>Observability: emit QUOTA_WINDOW_ROTATED
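
The check, record, and rotate steps in the diagrams above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the code in budget/quota_tracker.py: the limits layout, the injectable clock parameter, the tuple return value, and the provider name used in the usage note are hypothetical, and treating a request count at the limit as exhausted is also an assumption (the thread only specifies the token comparison).

```python
import asyncio
from datetime import datetime, timezone


def window_start(window, now):
    """Return the UTC-aware start of the window that contains `now`."""
    if window == "per_minute":
        return now.replace(second=0, microsecond=0)
    if window == "per_hour":
        return now.replace(minute=0, second=0, microsecond=0)
    if window == "per_day":
        return now.replace(hour=0, minute=0, second=0, microsecond=0)
    if window == "per_month":
        return now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unknown window: {window}")


class QuotaTracker:
    """Sketch: limits is {provider: {window: (max_requests, max_tokens)}}."""

    def __init__(self, limits, clock=lambda: datetime.now(timezone.utc)):
        self._limits = limits
        self._clock = clock
        self._lock = asyncio.Lock()
        now = clock()
        self._usage = {
            provider: {
                w: {"start": window_start(w, now), "requests": 0, "tokens": 0}
                for w in windows
            }
            for provider, windows in limits.items()
        }

    def _rotate(self, provider, now):
        # Reset counters for every window whose boundary has been crossed.
        for window, counters in self._usage[provider].items():
            start = window_start(window, now)
            if start > counters["start"]:
                counters.update(start=start, requests=0, tokens=0)

    async def record_usage(self, provider, requests=1, tokens=0):
        if requests < 0 or tokens < 0:
            raise ValueError("requests and tokens must be non-negative")
        async with self._lock:
            if provider not in self._usage:
                return  # untracked provider: nothing to record
            self._rotate(provider, self._clock())
            for counters in self._usage[provider].values():
                counters["requests"] += requests
                counters["tokens"] += tokens

    async def check_quota(self, provider, estimated_tokens=0):
        """Return (allowed, exhausted_windows) for a pre-flight check."""
        async with self._lock:
            if provider not in self._usage:
                return True, []
            self._rotate(provider, self._clock())
            exhausted = []
            for window, c in self._usage[provider].items():
                max_requests, max_tokens = self._limits[provider][window]
                over_tokens = c["tokens"] + estimated_tokens > max_tokens
                if c["requests"] >= max_requests or over_tokens:
                    exhausted.append(window)
            return not exhausted, exhausted
```

With a limit of 100 tokens per minute for a hypothetical provider "acme", recording 100 tokens makes a follow-up check with one estimated token fail until the clock crosses the next minute boundary, at which point the counters reset.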

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~70 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main change: adding configurable cost tiers and subscription/quota-aware tracking. |
| Description check | ✅ Passed | The description clearly explains the changeset with specific modules, features, and integration points, directly relating to the code changes. |
| Linked Issues check | ✅ Passed | The PR meets all key acceptance criteria from issue #67: CostTierDefinition with display properties, built-in 4-tier defaults with user override, SubscriptionConfig and quota models, QuotaTracker with multi-window tracking, BudgetEnforcer integration, and config schema updates. |
| Out of Scope Changes check | ✅ Passed | All code changes directly support the stated objectives: cost tier definitions, subscription/quota models, quota tracking service, BudgetEnforcer integration, config updates, and comprehensive tests. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 96.02%, which is sufficient. The required threshold is 80.00%. |


@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the system's ability to manage and enforce costs and resource usage for AI model providers. It introduces a flexible framework for defining and classifying models into configurable cost tiers, alongside a robust quota and subscription tracking mechanism. By integrating these features into the existing budget enforcement, the system can now perform granular pre-flight checks, prevent overages, and lay the groundwork for graceful degradation strategies, thereby improving cost control and operational stability.

Highlights

  • Configurable Cost Tiers: Introduced CostTierDefinition and CostTiersConfig models, allowing for flexible definition of cost tiers based on price ranges, display properties, and classification functions. Built-in default tiers (low/medium/high/premium) are provided, with support for user overrides and merging.
  • Quota and Subscription Models: Added new Pydantic models (SubscriptionConfig, QuotaLimit, QuotaWindow, ProviderCostModel, DegradationConfig, QuotaSnapshot, QuotaCheckResult) to define provider subscription plans, usage limits across various time windows, and strategies for graceful degradation when quotas are approached.
  • Quota Tracking Service: Implemented a QuotaTracker service responsible for monitoring per-provider, per-window request and token usage. This service handles automatic window rotation and ensures concurrency safety using asyncio.Lock.
  • BudgetEnforcer Integration: The existing BudgetEnforcer was enhanced to incorporate provider quota checks into its pre-flight check_can_execute() method. A new public check_quota() API was added, and a QuotaExhaustedError is now raised when a provider's quota is depleted.
  • Configuration Schema Updates: The main configuration schema (RootConfig) now includes a cost_tiers field for defining custom cost tiers. Additionally, ProviderConfig now supports subscription and degradation fields to configure provider-specific quota limits and degradation behaviors.
  • Documentation Updates: Key documentation files (DESIGN_SPEC.md, CLAUDE.md, README.md) were updated to reflect the new cost tier and quota tracking functionalities, providing clearer guidance on their implementation and usage.
Changelog
  • CLAUDE.md
    • Updated the description of the budget/ directory to include cost tiers and quota/subscription tracking.
  • DESIGN_SPEC.md
    • Updated the M5 implementation snapshot to reflect the completion of budget enforcement, including configurable cost tiers and quota/subscription tracking.
    • Added an implementation note for M5 detailing the new subscription and degradation fields in ProviderConfig.
    • Updated the BudgetEnforcer implementation note to include QuotaTracker and ModelResolver in its composition and to mention provider quota enforcement in pre-flight checks.
    • Added a new entry to the conventions table for "Cost tiers & quota tracking" with details on their implementation and benefits.
  • README.md
    • Updated the "Budget Enforcement (M5)" entry to explicitly mention configurable cost tiers and quota/subscription tracking.
  • src/ai_company/budget/__init__.py
    • Added imports for CostTierDefinition, CostTiersConfig, classify_model_tier, resolve_tiers, QuotaLimit, QuotaWindow, SubscriptionConfig, DegradationAction, DegradationConfig, ProviderCostModel, QuotaSnapshot, QuotaCheckResult, QuotaTracker, and effective_cost_per_1k.
    • Exported new cost tier and quota-related symbols in __all__.
  • src/ai_company/budget/cost_tiers.py
    • Added a new file defining CostTierDefinition (metadata for a single tier), CostTiersConfig (configuration for tiers), BUILTIN_TIERS (default tier definitions), resolve_tiers (merges built-in and user-defined tiers), and classify_model_tier (classifies model cost into a tier).
  • src/ai_company/budget/enforcer.py
    • Imported QuotaCheckResult and QuotaExhaustedError.
    • Imported new observability events for quota checks.
    • Modified the BudgetEnforcer constructor to accept an optional QuotaTracker instance.
    • Updated check_can_execute to include an optional provider_name argument and perform provider quota checks if a QuotaTracker is configured.
    • Added a new public asynchronous method check_quota to delegate quota checks to the QuotaTracker.
    • Added a private asynchronous method _check_provider_quota to handle raising QuotaExhaustedError.
    • Added a helper function _always_allowed_result for QuotaCheckResult.
  • src/ai_company/budget/quota.py
    • Added a new file defining QuotaWindow (time window enum), QuotaLimit (single quota limit), ProviderCostModel (how a provider charges), SubscriptionConfig (subscription and quota configuration), DegradationAction (action on quota exhaustion), DegradationConfig (degradation strategy configuration), QuotaSnapshot (point-in-time usage snapshot), and QuotaCheckResult (result of a quota check).
    • Included utility functions window_start (computes UTC-aware start of a window) and effective_cost_per_1k (computes cost based on provider model).
  • src/ai_company/budget/quota_tracker.py
    • Added a new file implementing the QuotaTracker service, which tracks per-provider request and token usage across configured time windows.
    • Implemented record_usage to update counters and handle window rotation.
    • Implemented check_quota for pre-flight checks against limits.
    • Implemented get_snapshot and get_all_snapshots to retrieve current usage data.
    • Included internal helper functions for exhaustion checks and reason building.
  • src/ai_company/config/defaults.py
    • Added cost_tiers to the default configuration dictionary.
  • src/ai_company/config/schema.py
    • Imported CostTiersConfig, DegradationConfig, and SubscriptionConfig.
    • Updated ProviderConfig to include subscription and degradation fields.
    • Updated RootConfig to include a cost_tiers field.
  • src/ai_company/engine/__init__.py
    • Exported QuotaExhaustedError.
  • src/ai_company/engine/errors.py
    • Defined QuotaExhaustedError as a new exception class, subclassing BudgetExhaustedError.
  • src/ai_company/observability/events/budget.py
    • Added new event constants BUDGET_TIER_RESOLVED and BUDGET_TIER_CLASSIFY_MISS.
  • src/ai_company/observability/events/quota.py
    • Added a new file defining event constants for quota tracking, such as QUOTA_TRACKER_CREATED, QUOTA_USAGE_RECORDED, QUOTA_CHECK_ALLOWED, QUOTA_CHECK_DENIED, QUOTA_WINDOW_ROTATED, QUOTA_SNAPSHOT_QUERIED, and QUOTA_USAGE_SKIPPED.
  • tests/unit/budget/conftest.py
    • Added CostTierDefinitionFactory, CostTiersConfigFactory, QuotaLimitFactory, and SubscriptionConfigFactory for testing.
    • Added a make_quota_tracker helper function for creating QuotaTracker instances in tests.
  • tests/unit/budget/test_cost_tiers.py
    • Added a new file with unit tests for CostTierDefinition model validation, CostTiersConfig behavior, BUILTIN_TIERS properties, resolve_tiers functionality (merging and overrides), and classify_model_tier logic (classification boundaries).
  • tests/unit/budget/test_enforcer_quota.py
    • Added a new file with unit tests for BudgetEnforcer's integration with QuotaTracker, covering scenarios where quota is allowed, exhausted, skipped, and handling of unexpected errors.
  • tests/unit/budget/test_quota.py
    • Added a new file with comprehensive unit tests for all new quota and subscription domain models: QuotaWindow, QuotaLimit, ProviderCostModel, SubscriptionConfig, DegradationAction, DegradationConfig, QuotaSnapshot, and QuotaCheckResult.
    • Tested utility functions window_start and effective_cost_per_1k.
    • Included tests for validation rules, computed properties, and edge cases like negative inputs or over-limit usage.
  • tests/unit/budget/test_quota_tracker.py
    • Added a new file with unit tests for the QuotaTracker service, covering its construction, record_usage (including accumulation and window rotation), check_quota (including estimated tokens and multiple windows), get_snapshot, and get_all_snapshots.
    • Included tests for deep copy isolation and input validation.
  • tests/unit/config/conftest.py
    • Updated ProviderConfigFactory to include default SubscriptionConfig and DegradationConfig.
    • Updated RootConfigFactory to include default CostTiersConfig.
  • tests/unit/observability/test_events.py
    • Added 'quota' to the list of domain modules checked for event constant discovery.
Activity
  • 14 new tests were added to cover review findings, including exhaustion reasons with estimated tokens, deep copy isolation, multi-window exhaustion, input validation, cross-field validation, and graceful degradation.
  • All 4593 existing tests passed, with 6 tests skipped (related to symlinks/real LLM interactions).
  • Code coverage stands at 96.35%, exceeding the 80% minimum requirement.
  • Mypy strict type checking reported 0 errors.
  • Ruff linter checks passed without issues.
  • All pre-commit hooks passed successfully.
  • The pull request underwent pre-review by 9 automated agents (code-reviewer, python-reviewer, pr-test-analyzer, silent-failure-hunter, comment-analyzer, type-design-analyzer, logging-audit, resilience-audit, docs-consistency), leading to 34 addressed findings and 0 skipped.

@greptile-apps

greptile-apps bot commented Mar 9, 2026

Greptile Summary

This PR completes the M5 budget enforcement layer by adding three tightly-integrated components: configurable cost tiers (cost_tiers.py), a comprehensive quota/subscription model layer (quota.py), and an async-safe per-provider quota tracking service (quota_tracker.py). BudgetEnforcer is extended with an optional QuotaTracker injection point, and QuotaExhaustedError (a BudgetExhaustedError subclass) is wired into the existing pre-flight exception hierarchy. The implementation is well-tested (96% coverage, mypy strict, ruff clean) and unimplemented FALLBACK/QUEUE strategies are correctly documented in both code and the design spec.

Key findings:

  • Logic — snapshot / enforcement token-boundary divergence (quota.py:278): QuotaSnapshot.is_exhausted uses >= for tokens (at-limit counts as exhausted) while _is_window_exhausted uses strict > (exact-fill is permitted). When all tokens are consumed and a check is made with zero estimated projection, the snapshot signals exhaustion but check_quota returns allowed. Since check_can_execute defaults the projection to zero, this discrepancy is always present in the common code path.
  • Style — QUOTA_WINDOW_ROTATED logged at DEBUG (quota_tracker.py:162): CLAUDE.md requires all state transitions to log at INFO. A window rotation resets quota counters and directly affects subsequent enforcement decisions; it warrants INFO-level visibility.
  • Style — effective_cost_per_1k accepts negative cost components silently (quota.py:377): No validation or warning is emitted when either cost argument is negative, producing a semantically invalid result that classifies silently without surfacing the anomaly.
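
The first finding is easiest to see with the two predicates side by side. The sketch below is a reduced illustration that keeps only the token comparison; the class and function here are simplified stand-ins for QuotaSnapshot.is_exhausted and _is_window_exhausted, with an assumed field layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotaSnapshot:
    """Reduced stand-in: only the token fields relevant to the finding."""

    tokens_used: int
    max_tokens: int

    @property
    def is_exhausted(self):
        # Snapshot side: usage at the limit already counts as exhausted (>=).
        return self.tokens_used >= self.max_tokens


def is_window_exhausted(tokens_used, max_tokens, estimated_tokens=0):
    # Enforcement side: filling the window exactly is still permitted (>).
    return tokens_used + estimated_tokens > max_tokens


snap = QuotaSnapshot(tokens_used=1000, max_tokens=1000)
snapshot_says_exhausted = snap.is_exhausted
enforcement_says_exhausted = is_window_exhausted(snap.tokens_used, snap.max_tokens)
```

At exactly 1000 of 1000 tokens with a zero estimate, the snapshot reports exhaustion while the enforcement predicate still allows the call; the two agree again as soon as the estimate is at least one token.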

Confidence Score: 3/5

  • Mostly safe to merge; one behavioral inconsistency between the snapshot API and enforcement predicate warrants a fix before shipping the snapshot as a public observability surface.
  • The PR is well-engineered overall — strong test coverage, mypy strict, correct exception hierarchy, acknowledged TOCTOU, and documented unimplemented strategies. The score is held at 3 due to the logic inconsistency between QuotaSnapshot.is_exhausted and _is_window_exhausted for the token-at-limit boundary: the two APIs give opposite answers for the same state, which will mislead any operator or downstream consumer that relies on the snapshot to predict enforcement behavior. The two style issues (log level on window rotation, no negative-cost guard in effective_cost_per_1k) are low-risk but worth addressing per project conventions.
  • Pay close attention to src/ai_company/budget/quota.py — specifically the QuotaSnapshot.is_exhausted property (line 278) and the effective_cost_per_1k helper (line 377).

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/ai_company/budget/quota.py | New file: quota and subscription domain models. QuotaSnapshot.is_exhausted uses >= for token exhaustion, but the enforcement predicate _is_window_exhausted uses > (exact-fill permitted), causing a behavioral divergence between the snapshot API and actual enforcement. effective_cost_per_1k also accepts negative cost components without validation or logging. |
| src/ai_company/budget/quota_tracker.py | New file: async-safe per-provider quota tracking with window rotation. Logic is correct and well-tested. QUOTA_WINDOW_ROTATED is emitted at DEBUG level; per CLAUDE.md, state transitions should be logged at INFO. TOCTOU gap between check and record is acknowledged in the class docstring. |
| src/ai_company/budget/enforcer.py | Extended with optional QuotaTracker injection and check_quota / _check_provider_quota methods. QuotaExhaustedError (a BudgetExhaustedError subclass) is correctly re-raised by the except BudgetExhaustedError: raise guard. The public check_quota method correctly delegates and documents its different error-handling contract vs check_can_execute. |
| src/ai_company/budget/cost_tiers.py | New file: configurable cost tier definitions with merge/override semantics. Model validators, boundary classification, and logging are all well-structured. No issues found. |
| src/ai_company/config/schema.py | Adds subscription/degradation fields to ProviderConfig and cost_tiers to RootConfig. Cross-field validator correctly checks that fallback_providers reference known providers. Clean integration. |
| src/ai_company/engine/errors.py | Adds QuotaExhaustedError as a subclass of BudgetExhaustedError. Docstring correctly notes that FALLBACK/QUEUE degradation routing is unimplemented. Correct hierarchy placement. |
| tests/unit/budget/test_quota_tracker.py | Comprehensive tests covering construction, record_usage, check_quota, snapshots, window rotation (with time mocking), deep-copy isolation, multi-window exhaustion, and projected-token denial. No issues found. |
| tests/unit/budget/test_enforcer_quota.py | Tests for BudgetEnforcer quota integration including graceful degradation on unexpected exceptions. Well-structured with clear helpers and deterministic time patching. No issues found. |
| src/ai_company/observability/events/quota.py | New file: quota event constants. All 7 event constants are well-named and follow established conventions. |
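
The cross-field check mentioned for config/schema.py (fallback_providers must reference known providers) could look roughly like the helper below. It works on plain dictionaries for brevity; the nesting of fallback_providers under a degradation key and the function name are assumptions, and the real check is a Pydantic validator.

```python
def find_unknown_fallbacks(providers):
    """Map each provider to fallback names that are not configured providers.

    providers: {name: {"degradation": {"fallback_providers": [...]}}}
    (assumed shape, for illustration only)
    """
    known = set(providers)
    problems = {}
    for name, cfg in providers.items():
        fallbacks = cfg.get("degradation", {}).get("fallback_providers", [])
        unknown = [p for p in fallbacks if p not in known]
        if unknown:
            problems[name] = unknown
    return problems
```

A validator built on this idea would raise when the returned mapping is non-empty, so a typo in a fallback name surfaces at config load time rather than at quota exhaustion.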

Sequence Diagram

sequenceDiagram
    participant Caller
    participant BudgetEnforcer
    participant QuotaTracker
    participant QuotaSnapshot

    Caller->>BudgetEnforcer: check_can_execute(agent_id, provider_name, estimated_tokens)
    BudgetEnforcer->>BudgetEnforcer: _check_monthly_hard_stop()
    BudgetEnforcer->>BudgetEnforcer: _check_daily_limit()
    BudgetEnforcer->>BudgetEnforcer: _check_provider_quota(agent_id, provider_name, estimated_tokens)
    BudgetEnforcer->>QuotaTracker: check_quota(provider_name, estimated_tokens)
    QuotaTracker->>QuotaTracker: _is_window_exhausted(usage, quota, estimated_tokens)
    alt quota exhausted
        QuotaTracker-->>BudgetEnforcer: QuotaCheckResult(allowed=False, reason=..., exhausted_windows=...)
        BudgetEnforcer-->>Caller: raise QuotaExhaustedError
    else quota OK
        QuotaTracker-->>BudgetEnforcer: QuotaCheckResult(allowed=True)
        BudgetEnforcer-->>Caller: return (execution allowed)
    end

    Caller->>QuotaTracker: record_usage(provider_name, requests, tokens)
    QuotaTracker->>QuotaTracker: rotate window if boundary crossed
    QuotaTracker->>QuotaTracker: accumulate counters

    Caller->>QuotaTracker: get_snapshot(provider_name)
    QuotaTracker-->>QuotaSnapshot: build QuotaSnapshot(requests_used, tokens_used, ...)
    QuotaSnapshot-->>Caller: snapshot with is_exhausted, requests_remaining, tokens_remaining
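
The pre-flight path in this diagram, together with the exception handling noted for enforcer.py, can be sketched as follows. This is an illustrative reduction, not the PR's code: StubTracker and its return shape are hypothetical, the budget checks are elided, and allowing execution when the tracker fails unexpectedly is an assumption about what "graceful degradation" means here.

```python
import asyncio


class BudgetExhaustedError(Exception):
    """Base class for pre-flight budget failures."""


class QuotaExhaustedError(BudgetExhaustedError):
    """Raised when a provider quota is exhausted."""


class StubTracker:
    """Hypothetical tracker used only to drive the sketch."""

    def __init__(self, allowed=True, fail=False):
        self.allowed = allowed
        self.fail = fail

    async def check_quota(self, provider_name, estimated_tokens=0):
        if self.fail:
            raise RuntimeError("tracker backend unavailable")
        reason = "" if self.allowed else "per_minute tokens exhausted"
        return self.allowed, reason


class BudgetEnforcer:
    def __init__(self, quota_tracker=None):
        self._quota_tracker = quota_tracker

    async def check_can_execute(self, agent_id, provider_name=None, estimated_tokens=0):
        try:
            # The monthly hard stop and daily limit checks would run here.
            await self._check_provider_quota(agent_id, provider_name, estimated_tokens)
        except BudgetExhaustedError:
            raise  # includes QuotaExhaustedError, which subclasses it
        except (MemoryError, RecursionError):
            raise  # never swallow fatal interpreter errors
        except Exception:
            # Assumed behavior: an unexpected tracker failure does not block work.
            return

    async def _check_provider_quota(self, agent_id, provider_name, estimated_tokens):
        if self._quota_tracker is None or provider_name is None:
            return  # quota check skipped
        allowed, reason = await self._quota_tracker.check_quota(
            provider_name, estimated_tokens
        )
        if not allowed:
            raise QuotaExhaustedError(f"{provider_name}: {reason}")
```

Because QuotaExhaustedError subclasses BudgetExhaustedError, callers that already catch the base class handle quota exhaustion without changes.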

Last reviewed commit: 05fbbc8

Copilot AI left a comment

Pull request overview

Adds configurable cost-tier definitions and provider subscription/quota awareness to the budget enforcement layer, introducing new budget-domain models/services and wiring them into config + observability.

Changes:

  • Introduces configurable cost tiers (CostTierDefinition, CostTiersConfig, resolve_tiers(), classify_model_tier()) and exposes them via config and observability events.
  • Adds subscription/quota domain models (SubscriptionConfig, QuotaLimit, QuotaSnapshot, etc.) and an async-safe QuotaTracker for per-provider, per-window usage tracking.
  • Integrates quota checks into BudgetEnforcer pre-flight execution checks and adds associated errors/events/docs/tests.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/unit/observability/test_events.py | Updates domain-module discovery expectations to include new quota events module. |
| tests/unit/config/conftest.py | Extends config factories to include cost_tiers, subscription, and degradation defaults. |
| tests/unit/budget/test_quota_tracker.py | Adds unit tests covering QuotaTracker behavior (recording, rotation, snapshots, validation). |
| tests/unit/budget/test_quota.py | Adds unit tests for quota/subscription models and helper functions. |
| tests/unit/budget/test_enforcer_quota.py | Adds unit tests validating BudgetEnforcer quota integration and error behavior. |
| tests/unit/budget/test_cost_tiers.py | Adds unit tests for cost tier definitions, merging, and classification boundaries. |
| tests/unit/budget/conftest.py | Adds budget test factories/fixtures for cost tiers and quota tracking. |
| src/ai_company/observability/events/quota.py | Adds quota-related observability event constants. |
| src/ai_company/observability/events/budget.py | Adds budget event constants for tier resolution and classification misses. |
| src/ai_company/engine/errors.py | Introduces QuotaExhaustedError for provider quota exhaustion. |
| src/ai_company/engine/__init__.py | Re-exports QuotaExhaustedError from the engine package. |
| src/ai_company/config/schema.py | Extends provider/root config with subscription, degradation, and cost_tiers fields. |
| src/ai_company/config/defaults.py | Adds default cost_tiers stanza to the default config dict. |
| src/ai_company/budget/quota_tracker.py | Implements QuotaTracker service with window rotation + snapshots + logging. |
| src/ai_company/budget/quota.py | Adds quota/subscription/degradation models and helpers (window_start, effective_cost_per_1k). |
| src/ai_company/budget/enforcer.py | Wires quota checks into BudgetEnforcer and adds check_quota() API. |
| src/ai_company/budget/cost_tiers.py | Adds cost tier models, built-in tier set, merge logic, and classifier. |
| src/ai_company/budget/__init__.py | Re-exports new budget APIs (tiers, quota models, tracker). |
| README.md | Updates milestone feature list to mention tiers + quota/subscription tracking. |
| DESIGN_SPEC.md | Documents newly implemented budget/quota/tier capabilities and event/module structure. |
| CLAUDE.md | Updates repo layout description to include tiers and quota/subscription tracking under budget. |
Comments suppressed due to low confidence (1)

src/ai_company/budget/enforcer.py:127

  • except MemoryError, RecursionError: is invalid syntax in Python 3 and will prevent this module from importing. Use tuple exception syntax (and optionally bind the exception) instead, e.g. except (MemoryError, RecursionError): / as exc.
        except BudgetExhaustedError:
            raise
        except MemoryError, RecursionError:  # builtin MemoryError (OOM)
            raise
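
For reference, a minimal sketch of the parenthesized form the comment recommends. It parses on every Python 3 release; the unparenthesized `except A, B:` spelling is only accepted from Python 3.14 onward (PEP 758), so which spelling is appropriate depends on the interpreter the project targets. `run_preflight` and its `check` argument are hypothetical stand-ins for illustration, not the enforcer's real API.

```python
from collections.abc import Callable


def run_preflight(check: Callable[[], None]) -> bool:
    """Run a check; re-raise fatal errors and tolerate anything else.

    Hypothetical helper. Returns True if the check passed and False if it
    failed with a non-fatal error that this sketch chooses to tolerate.
    """
    try:
        check()
    except (MemoryError, RecursionError):
        # Fatal interpreter-level conditions are never swallowed.
        raise
    except Exception:
        # Any other failure is tolerated in this sketch.
        return False
    return True
```
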

Comment on lines +115 to +121
        if provider_name not in self._usage:
            logger.debug(
                QUOTA_USAGE_SKIPPED,
                provider=provider_name,
                reason="no_subscription_config",
            )
            return

Copilot AI Mar 9, 2026

QuotaTracker treats providers with a SubscriptionConfig but no quotas the same as unknown providers (provider_name not in self._usage) and logs reason="no_subscription_config". This is misleading in observability; consider distinguishing "unknown provider" from "no quotas configured" (or key _usage off subscriptions and check sub_config.quotas separately).

Copilot uses AI. Check for mistakes.
Comment on lines +3 to +6
Provides configurable metadata for cost tiers: price ranges, display
properties, and model-to-tier classification. The built-in ``CostTier``
enum (``core.enums``) remains for backward compatibility; this module
adds a configurable layer on top.

Copilot AI Mar 9, 2026

The module docstring references the built-in CostTier enum as (core.enums), but in this repo the enum lives in ai_company.core.enums. Updating the reference avoids pointing readers to a non-existent module path.

Comment on lines +431 to +435
    def test_defaults_to_now(self) -> None:
        """Uses current time when now is not provided."""
        result = window_start(QuotaWindow.PER_DAY)
        now = datetime.now(UTC)
        assert result.day == now.day

Copilot AI Mar 9, 2026

This test can be flaky around UTC midnight because it calls window_start() and then separately calls datetime.now(UTC) for comparison; the day could roll over between the two calls. Prefer passing a fixed now= into window_start() (or freezing time) and asserting the full expected value.

DESIGN_SPEC.md Outdated
| **Personality compatibility scoring** | Adopted (M3) | Weighted composite: 60% Big Five similarity (openness, conscientiousness, agreeableness, stress_response → 1−\|diff\|; extraversion → tent-function peaking at 0.3 diff), 20% collaboration alignment (ordinal adjacency: INDEPENDENT↔PAIR↔TEAM), 20% conflict approach (constructive pairs score 1.0, destructive pairs 0.2, mixed 0.4–0.6). `itertools.combinations` for team-level averaging. Result clamped to [0, 1]. | Covers behavioral diversity (extraversion complement), task alignment (conscientiousness similarity), and interpersonal friction (conflict approach). Weights are configurable module constants. |
| **Agent behavior testing** | Planned (M3) | Scripted `FakeProvider` for unit tests (deterministic turn sequences); behavioral outcome assertions for integration tests (task completed, tools called, cost within budget). | Leverages existing `FakeProvider` and `CompletionResponseFactory` fixtures. Precise engine testing without brittle response-matching at integration level. |
| **LLM call analytics** | Adopted (incremental) | M3: proxy metrics (`turns_per_task`, `tokens_per_task`) — adopted. M4 data models: call categorization (`productive`, `coordination`, `system`), category analytics, coordination metrics, orchestration ratio — adopted. M4 runtime collection pipeline and M5+ full analytics: planned. | Append-only, never blocks execution. Builds on existing `CostRecord` infrastructure. Detects orchestration overhead early. See §10.5. |
| **Cost tiers & quota tracking** | Adopted (M5) | Configurable `CostTier` definitions with merge/override semantics via `resolve_tiers(defaults, overrides)`. `SubscriptionConfig` + `QuotaLimit` model per-provider subscription plans. `QuotaTracker` enforces per-provider request/token quotas with window-based rotation. `DegradationConfig` controls behavior when quotas are approached. | Enables cost classification without hardcoding vendor tiers. Quota tracking prevents surprise overages at the provider level. Window-based rotation aligns quota resets with billing periods. See §10.4. |

Copilot AI Mar 9, 2026

DESIGN_SPEC says cost tiers are merged via resolve_tiers(defaults, overrides), but the implementation added in this PR exposes resolve_tiers(config: CostTiersConfig) instead. Update the spec to match the actual API so readers don’t implement against the wrong signature.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant and well-structured set of features for budget management, including configurable cost tiers, provider-level quota tracking, and integration with the existing BudgetEnforcer. The new Pydantic models for quotas and cost tiers are robust, and the QuotaTracker service is well-implemented with attention to concurrency safety. The accompanying tests are comprehensive. I have identified one critical syntax error and one high-severity concern regarding the fail-open behavior of the budget check.

            await self._check_provider_quota(agent_id, provider_name)
        except BudgetExhaustedError:
            raise
        except MemoryError, RecursionError:  # builtin MemoryError (OOM)
Contributor

critical

This except statement uses Python 2 syntax (except MemoryError, RecursionError:), which is a SyntaxError in Python 3. To catch multiple exceptions, they should be enclosed in a tuple.

        except (MemoryError, RecursionError):  # builtin MemoryError (OOM)


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/ai_company/budget/enforcer.py (1)

117-134: ⚠️ Potential issue | 🟠 Major

Don't fail open when the quota subsystem itself errors.

This broad fallback now includes _check_provider_quota(). Before quota integration, the allow-on-error path only weakened spend preflight; with quotas inside the same try, a tracker/config bug silently disables provider-cap enforcement altogether.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/budget/enforcer.py` around lines 117 - 134, The current
try/except wraps _check_monthly_hard_stop, _check_daily_limit, and
_check_provider_quota so any exception in _check_provider_quota silently falls
back to allow execution; move _check_provider_quota out of the broad try (or
give it its own tight try that only catches BudgetExhaustedError and
MemoryError/RecursionError) so provider quota subsystem errors are not
swallowed—adjust calls to _check_monthly_hard_stop, _check_daily_limit, and
_check_provider_quota accordingly to ensure only the preflight spend checks may
fall back, while quota errors propagate.
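
The restructuring described in this finding could look roughly like the sketch below. The function signature is hypothetical (the real `check_can_execute` is a method with different arguments), and `BudgetExhaustedError` is redefined locally only so the example is self-contained. The parenthesized `except` is used so the snippet also runs on interpreters older than 3.14.

```python
import asyncio
from collections.abc import Awaitable, Callable


class BudgetExhaustedError(Exception):
    """Local stand-in for the engine's BudgetExhaustedError."""


async def check_can_execute(
    spend_checks: Callable[[], Awaitable[None]],
    quota_check: Callable[[], Awaitable[None]],
) -> None:
    """Fail open for spend preflight only; let quota errors propagate."""
    try:
        await spend_checks()
    except BudgetExhaustedError:
        raise
    except (MemoryError, RecursionError):
        raise
    except Exception:
        # Existing fail-open path, now limited to the spend checks.
        # Real code would log this at WARNING level before continuing.
        pass

    # Outside the broad try: a tracker or config bug surfaces to the caller
    # instead of silently disabling provider-cap enforcement.
    await quota_check()
```
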
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@DESIGN_SPEC.md`:
- Line 1697: The spec overstates runtime behavior for
ProviderConfig.degradation: either narrow DegradationConfig documentation to
reflect current behavior (that QuotaExhaustedError still always raises and
FALLBACK/QUEUE routing is not implemented) and update the QuotaExhaustedError
docstring to explicitly state the limitation, or implement the missing runtime
behaviors (FALLBACK/QUEUE auto-downgrade or rejection) to match the spec; in
either case update DESIGN_SPEC.md and the QuotaExhaustedError docstring
consistently and add an explicit note (per coding guidelines) that alerts users
when the implementation deviates from the spec and why.

In `@src/ai_company/budget/enforcer.py`:
- Around line 94-99: check_can_execute currently performs a zero-token quota
check because it calls _check_provider_quota()/check_quota() without passing
through an estimated token count; thread an estimated_tokens argument from
check_can_execute into any calls to _check_provider_quota and ultimately to the
quota check method (check_quota) so the preflight uses the projected usage, and
update the other call sites referenced (the blocks around the other occurrences
of _check_provider_quota / check_quota noted at the other ranges) to pass the
same estimated_tokens value to maintain consistent token-based caps enforcement.

In `@src/ai_company/budget/quota_tracker.py`:
- Around line 88-157: check_quota() and record_usage() have a TOCTOU gap: under
concurrency multiple coroutines can each pass check_quota() then race to
record_usage(), oversubscribing limits. Fix by introducing an atomic admission
method (e.g. try_consume or reserve_and_consume) that acquires self._lock,
computes remaining capacity using the same window logic (window_start via
window_start(...) and current from self._usage[provider_name]), and if capacity
suffices increments the counters and returns success; update callers to use this
new atomic method instead of calling check_quota() then record_usage(), and
leave check_quota() as a pure read-only inspector (or remove it) so all
admission decisions are made via the single locked path in the new method; also
adjust record_usage() so it’s either private for non-admission bookkeeping or
folded into the atomic consume implementation.
- Around line 345-370: The token-path currently treats a request that exactly
reaches max_tokens as exhausted because both _is_window_exhausted and
_build_exhaustion_reason use "projected >= quota.max_tokens"; change those
checks to use a strict ">" comparison (i.e., projected > quota.max_tokens) so a
request that exactly fills the remaining tokens is allowed, and only requests
that would exceed the token limit are denied; update the comparisons in both
_is_window_exhausted (the return expression) and _build_exhaustion_reason (the
token append condition that uses the local projected variable) to reflect this
strict comparison.

In `@src/ai_company/budget/quota.py`:
- Around line 305-347: The window_start function currently reads calendar fields
from the provided now datetime and then tags the result as UTC, which yields
wrong boundaries for non-UTC-aware datetimes; update window_start (and
references to QuotaWindow.PER_MINUTE / PER_HOUR / PER_DAY) to first reject naive
datetimes (raise a ValueError if now.tzinfo is None or now.tzinfo.utcoffset(now)
is None) and then normalize/convert the timestamp to UTC via now =
now.astimezone(UTC) before extracting year/month/day/hour/minute to build the
window-start datetime.

In `@src/ai_company/config/schema.py`:
- Around line 202-209: RootConfig currently accepts
degradation.fallback_providers (a tuple of strings) without verifying they
reference valid provider names, so typos pass config load; update RootConfig
validation to cross-reference each entry in degradation.fallback_providers
against the canonical provider registry/list (the same source used for routing
model refs) and raise a validation error for any unknown provider names.
Implement this as a pydantic validator/root_validator inside RootConfig (or a
dedicated validate_fallback_providers method) that iterates over
RootConfig.degradation.fallback_providers, checks membership in the provider
registry, and produces a clear ValidationError listing invalid names so config
load fails fast.
- Around line 206-209: The default_factory currently builds a DegradationConfig
(which defaults to FALLBACK and triggers a validator warning when
fallback_providers is empty), causing CONFIG_VALIDATION_FAILED noise; fix by
making the default a no-op degradation config instead of the FALLBACK
default—either change the Field to use default=None with
Optional[DegradationConfig] or set default_factory to create an explicit
non-degrading config (e.g. default_factory=lambda:
DegradationConfig(mode=DegradationMode.NONE)); update the degradation Field
declaration and imports to reference DegradationConfig and DegradationMode
accordingly so normal ProviderConfig parsing does not emit warnings when the
user did not opt into degradation.

In `@src/ai_company/engine/errors.py`:
- Around line 41-46: The current QuotaExhaustedError class and call sites make
every quota miss terminal; update the logic that raises QuotaExhaustedError to
first inspect DegradationConfig.action (the degradation routing policy) and only
raise a terminal QuotaExhaustedError when the configured action mandates
termination; for FALLBACK or QUEUE actions, either return a non-terminal
signal/result or raise a different non-terminal exception type so the caller can
perform fallback/queue behavior; ensure references to QuotaExhaustedError and
BudgetExhaustedError remain consistent and update any docs/comments to reflect
that raising QuotaExhaustedError now depends on DegradationConfig.action.

In `@tests/unit/budget/test_cost_tiers.py`:
- Around line 64-72: The test allows zero-width tiers but the classifier uses
the half-open rule (min <= cost < max) so price_range_min == price_range_max
will never match; fix by either rejecting equal bounds during CostTierDefinition
validation (add a check in CostTierDefinition.__post_init__ or the existing
validate_tiers function to raise for price_range_min >= price_range_max) or
change the classifier comparison in the function that does the tier matching
(where it currently uses "min <= cost < max") to make the upper bound inclusive
for the final/only tier (use "min <= cost <= max" for that case or otherwise
ensure a deterministic inclusive rule). Ensure you reference CostTierDefinition
and the classifier function/method in cost_tiers.py when applying the change.
- Around line 302-349: Collapse the duplicated boundary tests for
classify_model_tier into a single `@pytest.mark.parametrize` table (reusing the
same parameterization you added around lines 368-400), replacing the separate
functions test_boundary_low_medium, test_boundary_medium_high,
test_boundary_high_premium and the within-range tests with parametrized cases
that include input cost and expected tier; remove the duplicated standalone
tests (the ones at lines 302-349) and ensure the parameter table also adds a
missing negative-cost regression case (e.g., cost = -0.001 -> expected "low") so
all boundaries and the negative-cost scenario are covered by the single
parametrized test for classify_model_tier.

In `@tests/unit/budget/test_enforcer_quota.py`:
- Around line 45-58: The tests flake because _make_quota_tracker() hardcodes
QuotaWindow.PER_MINUTE causing counters to rotate if the minute rolls over
between operations; update the helper to avoid minute windows by either (a)
change the default window in _make_quota_tracker to a longer-lived window like
QuotaWindow.PER_HOUR or (b) add a window parameter to _make_quota_tracker and
use that in the SubscriptionConfig so tests can pass an hour-long window or a
frozen clock; locate and modify the _make_quota_tracker function and any test
usages (tests around lines 104-132 and 241-259) to use the new default or pass
an explicit longer window, or alternatively freeze QuotaTracker's clock when
constructing it to ensure deterministic behavior.

In `@tests/unit/budget/test_quota_tracker.py`:
- Around line 39-40: The tests are flaky because they rely on real clock
rollovers for PER_MINUTE/PER_DAY windows; update the test helpers and affected
tests to use a longer, stable window (e.g., QuotaWindow.PER_HOUR) so counters
won't reset between immediate calls. Concretely, change the _minute_quota
factory (and any analogous _day_quota helpers) to return
QuotaLimit(window=QuotaWindow.PER_HOUR, ...) and update tests referenced in the
ranges (39-40, 81-109, 163-223, 276-333, 406-430) that call record_usage(),
check_quota(), or snapshot immediately to use those PER_HOUR helpers (leaving
explicit rotation/rollover tests using PER_HOUR as-is) so assertions become
deterministic.

In `@tests/unit/budget/test_quota.py`:
- Around line 431-435: The test test_defaults_to_now calls datetime.now(UTC)
separately from window_start(QuotaWindow.PER_DAY), which can flake at UTC
midnight; fix by capturing the current time once and using it for the assertion:
either call now = datetime.now(UTC) before invoking window_start and pass now
into window_start if it accepts a now parameter, or if window_start has no now
param, obtain before = datetime.now(UTC); result =
window_start(QuotaWindow.PER_DAY); after = datetime.now(UTC) and assert that
result.day is either before.day or after.day to tolerate the boundary. Ensure
references to test_defaults_to_now, window_start, QuotaWindow.PER_DAY, and
datetime.now(UTC) are used to locate the change.

---

Outside diff comments:
In `@src/ai_company/budget/enforcer.py`:
- Around line 117-134: The current try/except wraps _check_monthly_hard_stop,
_check_daily_limit, and _check_provider_quota so any exception in
_check_provider_quota silently falls back to allow execution; move
_check_provider_quota out of the broad try (or give it its own tight try that
only catches BudgetExhaustedError and MemoryError/RecursionError) so provider
quota subsystem errors are not swallowed—adjust calls to
_check_monthly_hard_stop, _check_daily_limit, and _check_provider_quota
accordingly to ensure only the preflight spend checks may fall back, while quota
errors propagate.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e191842a-93c9-427a-b6b7-9ad62d415e2e

📥 Commits

Reviewing files that changed from the base of the PR and between 26b3108 and 02baced.

📒 Files selected for processing (21)
  • CLAUDE.md
  • DESIGN_SPEC.md
  • README.md
  • src/ai_company/budget/__init__.py
  • src/ai_company/budget/cost_tiers.py
  • src/ai_company/budget/enforcer.py
  • src/ai_company/budget/quota.py
  • src/ai_company/budget/quota_tracker.py
  • src/ai_company/config/defaults.py
  • src/ai_company/config/schema.py
  • src/ai_company/engine/__init__.py
  • src/ai_company/engine/errors.py
  • src/ai_company/observability/events/budget.py
  • src/ai_company/observability/events/quota.py
  • tests/unit/budget/conftest.py
  • tests/unit/budget/test_cost_tiers.py
  • tests/unit/budget/test_enforcer_quota.py
  • tests/unit/budget/test_quota.py
  • tests/unit/budget/test_quota_tracker.py
  • tests/unit/config/conftest.py
  • tests/unit/observability/test_events.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Agent
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (5)
!(DESIGN_SPEC.md|.claude/**|**/litellm/**)

📄 CodeRabbit inference engine (CLAUDE.md)

Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases. Tests must use test-provider, test-small-001, etc.

Files:

  • README.md
  • CLAUDE.md
  • src/ai_company/engine/errors.py
  • src/ai_company/config/defaults.py
  • src/ai_company/observability/events/budget.py
  • tests/unit/budget/test_enforcer_quota.py
  • src/ai_company/observability/events/quota.py
  • src/ai_company/engine/__init__.py
  • src/ai_company/config/schema.py
  • tests/unit/budget/test_cost_tiers.py
  • tests/unit/config/conftest.py
  • tests/unit/budget/test_quota.py
  • src/ai_company/budget/__init__.py
  • tests/unit/observability/test_events.py
  • src/ai_company/budget/cost_tiers.py
  • src/ai_company/budget/quota_tracker.py
  • tests/unit/budget/test_quota_tracker.py
  • tests/unit/budget/conftest.py
  • DESIGN_SPEC.md
  • src/ai_company/budget/quota.py
  • src/ai_company/budget/enforcer.py
**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Always read DESIGN_SPEC.md before implementing any feature or planning any issue — the spec is the mandatory starting point for architecture, data models, and behavior. If implementation deviates from the spec, alert the user and explain why — user decides whether to proceed or update the spec. Do NOT silently diverge. When a spec section is referenced, read that section verbatim. When approved deviations occur, update DESIGN_SPEC.md to reflect the new reality.

Files:

  • README.md
  • CLAUDE.md
  • DESIGN_SPEC.md
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do not use from __future__ import annotations — Python 3.14 has PEP 649 native lazy annotations
Use except A, B: syntax without parentheses (not except (A, B):) — PEP 758 except syntax, enforced by ruff on Python 3.14
Add type hints to all public functions and classes — mypy strict mode enforced
Use Google-style docstrings (required on all public classes and functions) — enforced by ruff D rules
Enforce immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement. For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries
Separate config (frozen Pydantic models) from runtime state (mutable-via-copy models using model_copy(update=...)). Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 with adopted conventions: use @computed_field for derived values instead of storing redundant fields; use NotBlankStr (from core.types) for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations (e.g., multiple tool invocations, parallel agent calls) — prefer structured concurrency over bare create_task
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly — never silently swallow exceptions
Validate input at system boundaries (user input, external APIs, config files)
Set line length to 88 characters (ruff configured)

Files:

  • src/ai_company/engine/errors.py
  • src/ai_company/config/defaults.py
  • src/ai_company/observability/events/budget.py
  • tests/unit/budget/test_enforcer_quota.py
  • src/ai_company/observability/events/quota.py
  • src/ai_company/engine/__init__.py
  • src/ai_company/config/schema.py
  • tests/unit/budget/test_cost_tiers.py
  • tests/unit/config/conftest.py
  • tests/unit/budget/test_quota.py
  • src/ai_company/budget/__init__.py
  • tests/unit/observability/test_events.py
  • src/ai_company/budget/cost_tiers.py
  • src/ai_company/budget/quota_tracker.py
  • tests/unit/budget/test_quota_tracker.py
  • tests/unit/budget/conftest.py
  • src/ai_company/budget/quota.py
  • src/ai_company/budget/enforcer.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Every module with business logic must have: from ai_company.observability import get_logger followed by logger = get_logger(__name__) — never use import logging / logging.getLogger() / print()
Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget). Import directly: from ai_company.observability.events.<domain> import EVENT_CONSTANT
Use structured logging with kwargs: always logger.info(EVENT, key=value) — never logger.info("msg %s", val)
Log all error paths at WARNING or ERROR level with context before raising exceptions
Log all state transitions at INFO level
Use DEBUG logging for object creation, internal flow, and entry/exit of key functions

Files:

  • src/ai_company/engine/errors.py
  • src/ai_company/config/defaults.py
  • src/ai_company/observability/events/budget.py
  • src/ai_company/observability/events/quota.py
  • src/ai_company/engine/__init__.py
  • src/ai_company/config/schema.py
  • src/ai_company/budget/__init__.py
  • src/ai_company/budget/cost_tiers.py
  • src/ai_company/budget/quota_tracker.py
  • src/ai_company/budget/quota.py
  • src/ai_company/budget/enforcer.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow to categorize tests
Maintain 80% minimum code coverage (enforced in CI with --cov-fail-under=80)
Use asyncio_mode = "auto" in pytest configuration — no manual @pytest.mark.asyncio needed on async tests
Set test timeout to 30 seconds per test
Use pytest-xdist via -n auto for parallel test execution
Prefer @pytest.mark.parametrize for testing similar cases instead of multiple nearly-identical tests

Files:

  • tests/unit/budget/test_enforcer_quota.py
  • tests/unit/budget/test_cost_tiers.py
  • tests/unit/config/conftest.py
  • tests/unit/budget/test_quota.py
  • tests/unit/observability/test_events.py
  • tests/unit/budget/test_quota_tracker.py
  • tests/unit/budget/conftest.py
🧠 Learnings (1)
📚 Learning: 2026-03-09T10:20:23.072Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-09T10:20:23.072Z
Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • src/ai_company/observability/events/budget.py
  • src/ai_company/observability/events/quota.py
  • tests/unit/observability/test_events.py
  • DESIGN_SPEC.md
  • src/ai_company/budget/enforcer.py
🧬 Code graph analysis (10)
src/ai_company/engine/__init__.py (1)
src/ai_company/engine/errors.py (1)
  • QuotaExhaustedError (41-46)
src/ai_company/config/schema.py (2)
src/ai_company/budget/cost_tiers.py (1)
  • CostTiersConfig (85-116)
src/ai_company/budget/quota.py (2)
  • DegradationConfig (163-201)
  • SubscriptionConfig (77-146)
tests/unit/budget/test_cost_tiers.py (1)
src/ai_company/budget/cost_tiers.py (4)
  • CostTierDefinition (23-82)
  • CostTiersConfig (85-116)
  • classify_model_tier (202-244)
  • resolve_tiers (163-199)
tests/unit/config/conftest.py (2)
src/ai_company/budget/cost_tiers.py (1)
  • CostTiersConfig (85-116)
src/ai_company/budget/quota.py (2)
  • DegradationConfig (163-201)
  • SubscriptionConfig (77-146)
tests/unit/budget/test_quota.py (1)
src/ai_company/budget/quota.py (12)
  • DegradationAction (149-160)
  • DegradationConfig (163-201)
  • ProviderCostModel (62-74)
  • QuotaCheckResult (273-302)
  • QuotaLimit (30-59)
  • QuotaWindow (21-27)
  • SubscriptionConfig (77-146)
  • effective_cost_per_1k (350-370)
  • window_start (305-347)
  • requests_remaining (242-250)
  • tokens_remaining (254-262)
  • is_exhausted (266-270)
src/ai_company/budget/cost_tiers.py (1)
src/ai_company/observability/_logger.py (1)
  • get_logger (8-28)
src/ai_company/budget/quota_tracker.py (3)
src/ai_company/budget/quota.py (5)
  • QuotaCheckResult (273-302)
  • QuotaLimit (30-59)
  • QuotaSnapshot (204-270)
  • QuotaWindow (21-27)
  • window_start (305-347)
src/ai_company/observability/_logger.py (1)
  • get_logger (8-28)
src/ai_company/budget/enforcer.py (1)
  • check_quota (142-170)
tests/unit/budget/conftest.py (4)
src/ai_company/budget/cost_tiers.py (2)
  • CostTierDefinition (23-82)
  • CostTiersConfig (85-116)
src/ai_company/budget/enums.py (1)
  • BudgetAlertLevel (6-16)
src/ai_company/budget/quota.py (3)
  • QuotaLimit (30-59)
  • QuotaWindow (21-27)
  • SubscriptionConfig (77-146)
src/ai_company/budget/quota_tracker.py (1)
  • QuotaTracker (49-342)
src/ai_company/budget/quota.py (1)
src/ai_company/observability/_logger.py (1)
  • get_logger (8-28)
src/ai_company/budget/enforcer.py (3)
src/ai_company/budget/quota.py (1)
  • QuotaCheckResult (273-302)
src/ai_company/engine/errors.py (2)
  • BudgetExhaustedError (24-34)
  • QuotaExhaustedError (41-46)
src/ai_company/budget/quota_tracker.py (2)
  • QuotaTracker (49-342)
  • check_quota (159-252)
🪛 LanguageTool
README.md

[typographical] ~26-~26: To join two clauses or introduce examples, consider using an em dash.
Context: ...n failures - Budget Enforcement (M5) - BudgetEnforcer service with pre-flight...

(DASH_RULE)

🔇 Additional comments (4)
src/ai_company/observability/events/budget.py (1)

30-31: LGTM.

The new tier event constants fit the existing naming scheme and give the cost-tier path dedicated observability hooks.

src/ai_company/observability/events/quota.py (1)

5-11: LGTM.

The quota event surface is small, specific, and matches the tracker/check operations added in this PR.

tests/unit/budget/conftest.py (1)

108-126: LGTM.

The new factories and make_quota_tracker() helper centralize quota/cost-tier setup and keep the tests on generic provider/model identifiers.

Also applies to: 266-277

src/ai_company/budget/__init__.py (1)

32-57: LGTM.

The package exports stay in sync with the new cost-tier and quota modules, which makes the public budget surface coherent from ai_company.budget.

Also applies to: 67-112

Comment on lines +88 to +157
    async def record_usage(
        self,
        provider_name: str,
        *,
        requests: int = 1,
        tokens: int = 0,
    ) -> None:
        """Record usage against all configured windows for a provider.

        Rotates window counters if a window boundary has been crossed.
        Providers with no subscription config are skipped with a DEBUG log.

        Args:
            provider_name: Provider to record usage for.
            requests: Number of requests to record (must be >= 0).
            tokens: Number of tokens to record (must be >= 0).

        Raises:
            ValueError: If requests or tokens is negative.
        """
        if requests < 0:
            msg = f"requests must be non-negative, got {requests}"
            raise ValueError(msg)
        if tokens < 0:
            msg = f"tokens must be non-negative, got {tokens}"
            raise ValueError(msg)

        if provider_name not in self._usage:
            logger.debug(
                QUOTA_USAGE_SKIPPED,
                provider=provider_name,
                reason="no_subscription_config",
            )
            return

        async with self._lock:
            now = datetime.now(UTC)
            provider_usage = self._usage[provider_name]

            for window_type in list(provider_usage):
                current = provider_usage[window_type]
                expected_start = window_start(window_type, now=now)

                if expected_start != current.window_start:
                    # Window boundary crossed — rotate
                    provider_usage[window_type] = _WindowUsage(
                        requests=requests,
                        tokens=tokens,
                        window_start=expected_start,
                    )
                    logger.debug(
                        QUOTA_WINDOW_ROTATED,
                        provider=provider_name,
                        window=window_type.value,
                        old_start=str(current.window_start),
                        new_start=str(expected_start),
                    )
                else:
                    provider_usage[window_type] = _WindowUsage(
                        requests=current.requests + requests,
                        tokens=current.tokens + tokens,
                        window_start=current.window_start,
                    )

            logger.debug(
                QUOTA_USAGE_RECORDED,
                provider=provider_name,
                requests=requests,
                tokens=tokens,
            )

⚠️ Potential issue | 🔴 Critical

check_quota() and record_usage() still have a TOCTOU gap.

These are separate locked operations. Under load, multiple coroutines can all observe the same remaining capacity, all pass check_quota(), and only later increment the counters in record_usage(). That allows the tracker to oversubscribe a window even though each individual call looked safe. If this is meant to enforce provider caps, the admission step needs an atomic reserve/consume path under one lock.

Also applies to: 159-243

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/budget/quota_tracker.py` around lines 88 - 157, check_quota()
and record_usage() have a TOCTOU gap: under concurrency multiple coroutines can
each pass check_quota() then race to record_usage(), oversubscribing limits. Fix
by introducing an atomic admission method (e.g. try_consume or
reserve_and_consume) that acquires self._lock, computes remaining capacity using
the same window logic (window_start via window_start(...) and current from
self._usage[provider_name]), and if capacity suffices increments the counters
and returns success; update callers to use this new atomic method instead of
calling check_quota() then record_usage(), and leave check_quota() as a pure
read-only inspector (or remove it) so all admission decisions are made via the
single locked path in the new method; also adjust record_usage() so it’s either
private for non-admission bookkeeping or folded into the atomic consume
implementation.
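
A minimal sketch of the single-lock admission path described above, for illustration only. `AtomicQuota`, `_Window`, and `try_consume` are hypothetical names, not code from this PR, and the sketch tracks one window with no rotation; the real `QuotaTracker` would also need its per-provider, per-window bookkeeping inside the same locked section.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class _Window:
    """Hypothetical per-window counters (stand-in for _WindowUsage)."""

    requests: int = 0
    tokens: int = 0


class AtomicQuota:
    """Hypothetical single-window tracker with an atomic admission path."""

    def __init__(self, max_requests: int, max_tokens: int) -> None:
        self._max_requests = max_requests
        self._max_tokens = max_tokens
        self._usage = _Window()
        self._lock = asyncio.Lock()

    async def try_consume(self, *, requests: int = 1, tokens: int = 0) -> bool:
        """Check capacity and record usage while holding one lock."""
        async with self._lock:
            if self._usage.requests + requests > self._max_requests:
                return False
            if self._usage.tokens + tokens > self._max_tokens:
                return False
            # Capacity is sufficient: consume it before releasing the lock.
            self._usage.requests += requests
            self._usage.tokens += tokens
            return True


async def _demo() -> list[bool]:
    quota = AtomicQuota(max_requests=3, max_tokens=1_000)
    # Ten concurrent callers compete for three request slots.
    return await asyncio.gather(
        *(quota.try_consume(tokens=10) for _ in range(10))
    )


results = asyncio.run(_demo())
admitted = sum(results)
```

Because the check and the increment happen inside the same `async with` block, no caller can observe capacity that another caller has already claimed, so `admitted` cannot exceed the configured request limit.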

- Change DegradationConfig default strategy from FALLBACK to ALERT
- Return None (not 0) from requests_remaining/tokens_remaining for unlimited
- Add UTC validation to window_start() — reject naive datetimes
- Thread estimated_tokens through check_can_execute → _check_provider_quota
- Reject zero-width cost tiers (min == max with finite max)
- Add RootConfig cross-validation of degradation fallback_providers
- Change token exhaustion check from >= to > (allow exact-fill)
- Distinguish "unknown provider" vs "no quotas" in QuotaTracker logs
- Add TOCTOU gap documentation to QuotaTracker class docstring
- Document error-handling asymmetry in BudgetEnforcer.check_quota
- Add allow_inf_nan=False to QuotaCheckResult model config
- Add logging before raise in model validators and input validation
- Fix module docstring reference (core.enums → ai_company.core.enums)
- Fix enum docstrings: Attributes → Members for StrEnum classes
- Fix DESIGN_SPEC: resolve_tiers signature, CostTierDefinition name,
  degradation implementation status
- Collapse duplicate boundary tests into parametrize table
- Add negative-cost test, naive-datetime test, fix midnight race
- Change test helpers from PER_MINUTE to PER_HOUR (avoid flakiness)
- Strengthen test_provider_without_quotas_not_tracked assertions
@Aureliolo Aureliolo merged commit 9baedfa into main Mar 9, 2026
8 of 9 checks passed
@Aureliolo Aureliolo deleted the feat/cost-tiers branch March 9, 2026 12:08
Comment on lines +162 to +168
                    logger.debug(
                        QUOTA_WINDOW_ROTATED,
                        provider=provider_name,
                        window=window_type.value,
                        old_start=str(current.window_start),
                        new_start=str(expected_start),
                    )

Window rotation logged at DEBUG, violating the INFO-for-state-transitions convention

CLAUDE.md states: "All state transitions must log at INFO". A quota window rotation is a meaningful state transition — it resets accumulated counters, directly affecting subsequent enforcement decisions. Logging it at DEBUG means ops teams running at INFO level would miss quota resets entirely.

QUOTA_CHECK_DENIED is correctly logged at INFO; the rotation that un-blocks a provider is equally significant and should match:

                    logger.info(
                        QUOTA_WINDOW_ROTATED,
                        provider=provider_name,
                        window=window_type.value,
                        old_start=str(current.window_start),
                        new_start=str(expected_start),
                    )

Rule Used: CLAUDE.md (source)


Comment on lines +377 to +397
def effective_cost_per_1k(
    cost_per_1k_input: float,
    cost_per_1k_output: float,
    cost_model: ProviderCostModel,
) -> float:
    """Compute effective cost per 1k tokens based on cost model.

    Returns 0.0 for SUBSCRIPTION and LOCAL models (pre-paid / free).
    Returns ``cost_per_1k_input + cost_per_1k_output`` for PER_TOKEN.

    Args:
        cost_per_1k_input: Cost per 1k input tokens.
        cost_per_1k_output: Cost per 1k output tokens.
        cost_model: The provider's cost model.

    Returns:
        Effective cost per 1k tokens.
    """
    if cost_model in (ProviderCostModel.SUBSCRIPTION, ProviderCostModel.LOCAL):
        return 0.0
    return cost_per_1k_input + cost_per_1k_output

effective_cost_per_1k silently accepts and propagates negative cost components

For PER_TOKEN models, the function returns cost_per_1k_input + cost_per_1k_output without any validation. A negative input cost produces a negative-or-reduced total, which classify_model_tier handles by returning None (only catches cost_per_1k_total < 0) — but a partially-negative combination (e.g., -0.001 + 0.005 = 0.004) would silently classify as "medium" even though a negative input cost has no valid domain meaning. This is explicitly tested and documented as "returns the sum as-is", but the function offers no guard and no docstring note about this edge case, which could surprise callers who feed unchecked provider config values.

Consider at minimum adding a note to the docstring, or a log warning (matching CLAUDE.md's "All error paths must log at WARNING") if either component is negative:

    if cost_per_1k_input < 0 or cost_per_1k_output < 0:
        logger.warning(
            BUDGET_TIER_CLASSIFY_MISS,
            cost_per_1k_input=cost_per_1k_input,
            cost_per_1k_output=cost_per_1k_output,
            reason="negative_cost_component",
        )
    return cost_per_1k_input + cost_per_1k_output
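
To make the partially-negative case concrete, here is a small sketch comparing a guard on the summed cost with a guard on each component. Both helper functions are hypothetical and exist only for this illustration; they are not part of `quota.py` or `cost_tiers.py`.

```python
def total_is_negative(cost_per_1k_input: float, cost_per_1k_output: float) -> bool:
    """Mimic a guard that only inspects the summed cost."""
    return cost_per_1k_input + cost_per_1k_output < 0


def has_negative_component(
    cost_per_1k_input: float,
    cost_per_1k_output: float,
) -> bool:
    """Stricter guard that inspects each component separately."""
    return cost_per_1k_input < 0 or cost_per_1k_output < 0


# The example from the comment: a negative input cost hidden by a larger output cost.
total_guard = total_is_negative(-0.001, 0.005)
component_guard = has_negative_component(-0.001, 0.005)
```

The sum-only guard lets the pair through because the total stays positive, while the per-component guard flags it.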

Comment on lines +278 to +282
    def is_exhausted(self) -> bool:
        """Whether any enforced limit in this window is exhausted."""
        if self.requests_limit > 0 and self.requests_used >= self.requests_limit:
            return True
        return self.tokens_limit > 0 and self.tokens_used >= self.tokens_limit

is_exhausted and _is_window_exhausted disagree on the token-exact-fill boundary

QuotaSnapshot.is_exhausted considers the window exhausted when tokens_used >= tokens_limit — "at-limit" equals exhausted. However, the enforcement predicate _is_window_exhausted in quota_tracker.py uses usage.tokens + estimated_tokens > quota.max_tokens — "at-limit with no projected tokens" is not exhausted.

When all tokens have been consumed and a check is made with no estimated projection, the snapshot reports is_exhausted=True but QuotaTracker.check_quota returns allowed=True. A caller consuming the snapshot API would reasonably expect these two signals to agree.

This is especially relevant since check_can_execute calls _check_provider_quota without forwarding estimated_tokens (it defaults to zero), meaning the at-limit state would pass the pre-flight check even though the snapshot already shows exhaustion.

Consider documenting the divergence in the is_exhausted docstring (explaining it is a conservative display signal, not the enforcement predicate), or aligning the two by switching is_exhausted to use a strict > comparison for the token check to match _is_window_exhausted.
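
The boundary disagreement can be reproduced with two standalone predicates. This is a sketch under assumptions: the function names are hypothetical, the token comparisons are taken from the expressions quoted in this comment, and the `max_tokens > 0` guard on the enforcement side is assumed rather than confirmed.

```python
def snapshot_is_exhausted(tokens_used: int, tokens_limit: int) -> bool:
    """Display-side token check as quoted: at-limit counts as exhausted."""
    return tokens_limit > 0 and tokens_used >= tokens_limit


def enforcement_is_exhausted(
    tokens_used: int,
    estimated_tokens: int,
    max_tokens: int,
) -> bool:
    """Enforcement-side token check as described: only overshoot counts."""
    # Assumption: a non-positive max_tokens means "no token limit".
    return max_tokens > 0 and tokens_used + estimated_tokens > max_tokens


# Exact fill with no projected tokens: the two signals disagree.
at_limit_snapshot = snapshot_is_exhausted(1000, 1000)
at_limit_enforced = enforcement_is_exhausted(1000, 0, 1000)

# Any positive projection makes the enforcement side agree again.
with_projection = enforcement_is_exhausted(1000, 1, 1000)
```

At exact fill the snapshot predicate reports exhaustion while the enforcement predicate still admits a zero-estimate request, which is the mismatch a caller reading both signals would see.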


Aureliolo added a commit that referenced this pull request Mar 10, 2026
🤖 I have created a release *beep* *boop*
---


##
[0.1.1](ai-company-v0.1.0...ai-company-v0.1.1)
(2026-03-10)


### Features

* add autonomy levels and approval timeout policies
([#42](#42),
[#126](#126))
([#197](#197))
([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and
approval decisions
([#186](#186))
([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot)
([#63](#63))
([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking
([#67](#67))
([#185](#185))
([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline
([#269](#269))
([435bdfe](435bdfe)),
closes [#267](#267)
* add coordination error taxonomy classification pipeline
([#146](#146))
([#181](#181))
([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies
([#175](#175))
([ce924fa](ce924fa)),
closes [#173](#173)
* add design specification, license, and project setup
([8669a09](8669a09))
* add env var substitution and config file auto-discovery
([#77](#77))
([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup
([#140](#140))
([09619cb](09619cb)),
closes [#139](#139)
* add HR engine and performance tracking
([#45](#45),
[#47](#47))
([#193](#193))
([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill
([#119](#119))
([deecc39](deecc39))
* add memory retrieval, ranking, and context injection pipeline
([#41](#41))
([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events
([#180](#180))
([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events
([#32](#32))
([46cfdd4](46cfdd4))
* add pluggable PersistenceBackend protocol with SQLite implementation
([#36](#36))
([f753779](f753779))
* add progressive trust and promotion/demotion subsystems
([#43](#43),
[#49](#49))
([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience
([#100](#100))
([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker
integration ([#40](#40))
([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival
([#125](#125),
[#48](#48))
([4a0832b](4a0832b))
* design unified provider interface
([#86](#86))
([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance
([#80](#80),
[#81](#81),
[#84](#84))
([15a9134](15a9134))
* implement agent runtime state vs immutable config split
([#115](#115))
([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator
([#11](#11))
([#143](#143))
([f2eb73a](f2eb73a))
* implement basic tool system (registry, invocation, results)
([#15](#15))
([c51068b](c51068b))
* implement built-in file system tools
([#18](#18))
([325ef98](325ef98))
* implement communication foundation — message bus, dispatcher, and
messenger ([#157](#157))
([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets
([#85](#85))
([cbf1496](cbf1496))
* implement conflict resolution protocol
([#122](#122))
([#166](#166))
([e03f9f2](e03f9f2))
* implement core entity and role system models
([#69](#69))
([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy
([#149](#149))
([e6e91ed](e6e91ed))
* implement engine extensions — Plan-and-Execute loop and call
categorization
([#134](#134),
[#135](#135))
([#159](#159))
([9b2699f](9b2699f))
* implement enterprise logging system with structlog
([#73](#73))
([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy
([#130](#130))
([6592515](6592515))
* implement hierarchical delegation and loop prevention
([#12](#12),
[#17](#17))
([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry
([#88](#88))
([ae3f18b](ae3f18b)),
closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation
([#174](#174))
([aa0eefe](aa0eefe))
* implement meeting protocol system
([#123](#123))
([ee7caca](ee7caca))
* implement message and communication domain models
([#74](#74))
([560a5d2](560a5d2))
* implement model routing engine
([#99](#99))
([d3c250b](d3c250b))
* implement parallel agent execution
([#22](#22))
([#161](#161))
([65940b3](65940b3))
* implement per-call cost tracking service
([#7](#7))
([#102](#102))
([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction
([#105](#105))
([934dd85](934dd85))
* implement single-task execution lifecycle
([#21](#21))
([#144](#144))
([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation
([#131](#131))
([#153](#153))
([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies
([#172](#172))
([c7f1b26](c7f1b26)),
closes [#26](#26)
[#30](#30)
* implement task decomposition and routing engine
([#14](#14))
([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models
([#71](#71))
([81eabf1](81eabf1))
* implement tool permission checking
([#16](#16))
([833c190](833c190))
* implement YAML config loader with Pydantic validation
([#59](#59))
([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation
([#75](#75))
([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout
([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout
([#62](#62))
([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6)
([#189](#189))
([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field
([#118](#118))
([c0bab18](c0bab18)),
closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all
([#137](#137))
([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes
([#64](#64))
([f581749](f581749))
* wire all modules into observability system
([#97](#97))
([f7a0617](f7a0617))


### Bug Fixes

* address Greptile post-merge review findings from PRs
[#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175)
([#176](#176))
([c5ca929](c5ca929))
* address post-merge review feedback from PRs
[#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167)
([#170](#170))
([3bf897a](3bf897a)),
closes [#169](#169)
* enforce strict mypy on test files
([#89](#89))
([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner
([#50](#50),
[#53](#53))
([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements
([#150](#150))
([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience
([#155](#155))
([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes
([#164](#164))
([c02832a](c02832a))
* pre-PR review fixes for post-merge findings
([#183](#183))
([26b3108](26b3108))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries
([#117](#117))
([7e5e861](7e5e861))


### Performance

* harden non-inferable principle implementation
([#195](#195))
([02b5f4e](02b5f4e)),
closes [#188](#188)


### Refactoring

* adopt NotBlankStr across all models
([#108](#108))
([#120](#120))
([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models
([#111](#111))
([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and
review fixes
([#182](#182))
([c107bf9](c107bf9))
* harden personality profiles, department validation, and template
rendering ([#158](#158))
([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop
([#124](#124))
([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules
([#136](#136))
([e9cba89](e9cba89))


### Documentation

* add ADR-001 memory layer evaluation and selection
([#178](#178))
([db3026f](db3026f)),
closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC
([#145](#145))
([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation
([#65](#65))
([55c1025](55c1025)),
closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions
([#127](#127))
([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols
([#128](#128))
([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols
([#121](#121))
([6832db6](6832db6))
* finalize 23 design decisions (ADR-002)
([#190](#190))
([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency
review agent
([#114](#114))
([99766ee](99766ee))


### Tests

* add e2e single agent integration tests
([#24](#24))
([#156](#156))
([f566fb4](f566fb4))
* add provider adapter integration tests
([#90](#90))
([40a61f4](40a61f4))


### CI/CD

* add Release Please for automated versioning and GitHub Releases
([#278](#278))
([a488758](a488758))
* bump actions/checkout from 4 to 6
([#95](#95))
([1897247](1897247))
* bump actions/upload-artifact from 4 to 7
([#94](#94))
([27b1517](27b1517))
* harden CI/CD pipeline
([#92](#92))
([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers
([#277](#277))
([aba48af](aba48af))


### Maintenance

* add /worktree skill for parallel worktree management
([#171](#171))
([951e337](951e337))
* add design spec context loading to research-link skill
([8ef9685](8ef9685))
* add post-merge-cleanup skill
([#70](#70))
([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md
([#103](#103))
([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md
([#101](#101))
([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1
([#191](#191))
([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group
([#96](#96))
([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5
([a49ee46](a49ee46))
* fix M0 audit items
([#66](#66))
([c7724b5](c7724b5))
* pin setup-uv action to full SHA
([#281](#281))
([4448002](4448002))
* post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests,
hookify rules
([#148](#148))
([c57a6a9](c57a6a9))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
Aureliolo added a commit that referenced this pull request Mar 11, 2026
🤖 I have created a release *beep* *boop*
---


##
[0.1.0](v0.0.0...v0.1.0)
(2026-03-11)


### Features

* add autonomy levels and approval timeout policies
([#42](#42),
[#126](#126))
([#197](#197))
([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and
approval decisions
([#186](#186))
([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot)
([#63](#63))
([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking
([#67](#67))
([#185](#185))
([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline
([#269](#269))
([435bdfe](435bdfe)),
closes [#267](#267)
* add coordination error taxonomy classification pipeline
([#146](#146))
([#181](#181))
([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies
([#175](#175))
([ce924fa](ce924fa)),
closes [#173](#173)
* add design specification, license, and project setup
([8669a09](8669a09))
* add env var substitution and config file auto-discovery
([#77](#77))
([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup
([#140](#140))
([09619cb](09619cb)),
closes [#139](#139)
* add HR engine and performance tracking
([#45](#45),
[#47](#47))
([#193](#193))
([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill
([#119](#119))
([deecc39](deecc39))
* add mandatory JWT + API key authentication
([#256](#256))
([c279cfe](c279cfe))
* add memory retrieval, ranking, and context injection pipeline
([#41](#41))
([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events
([#180](#180))
([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events
([#32](#32))
([46cfdd4](46cfdd4))
* add pluggable output scan response policies
([#263](#263))
([b9907e8](b9907e8))
* add pluggable PersistenceBackend protocol with SQLite implementation
([#36](#36))
([f753779](f753779))
* add progressive trust and promotion/demotion subsystems
([#43](#43),
[#49](#49))
([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience
([#100](#100))
([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker
integration ([#40](#40))
([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival
([#125](#125),
[#48](#48))
([4a0832b](4a0832b))
* design unified provider interface
([#86](#86))
([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance
([#80](#80),
[#81](#81),
[#84](#84))
([15a9134](15a9134))
* implement agent runtime state vs immutable config split
([#115](#115))
([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator
([#11](#11))
([#143](#143))
([f2eb73a](f2eb73a))
* implement AuditRepository for security audit log persistence
([#279](#279))
([94bc29f](94bc29f))
* implement basic tool system (registry, invocation, results)
([#15](#15))
([c51068b](c51068b))
* implement built-in file system tools
([#18](#18))
([325ef98](325ef98))
* implement communication foundation — message bus, dispatcher, and
messenger ([#157](#157))
([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets
([#85](#85))
([cbf1496](cbf1496))
* implement conflict resolution protocol
([#122](#122))
([#166](#166))
([e03f9f2](e03f9f2))
* implement core entity and role system models
([#69](#69))
([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy
([#149](#149))
([e6e91ed](e6e91ed))
* implement engine extensions — Plan-and-Execute loop and call
categorization
([#134](#134),
[#135](#135))
([#159](#159))
([9b2699f](9b2699f))
* implement enterprise logging system with structlog
([#73](#73))
([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy
([#130](#130))
([6592515](6592515))
* implement hierarchical delegation and loop prevention
([#12](#12),
[#17](#17))
([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry
([#88](#88))
([ae3f18b](ae3f18b)),
closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation
([#174](#174))
([aa0eefe](aa0eefe))
* implement meeting protocol system
([#123](#123))
([ee7caca](ee7caca))
* implement message and communication domain models
([#74](#74))
([560a5d2](560a5d2))
* implement model routing engine
([#99](#99))
([d3c250b](d3c250b))
* implement parallel agent execution
([#22](#22))
([#161](#161))
([65940b3](65940b3))
* implement per-call cost tracking service
([#7](#7))
([#102](#102))
([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction
([#105](#105))
([934dd85](934dd85))
* implement single-task execution lifecycle
([#21](#21))
([#144](#144))
([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation
([#131](#131))
([#153](#153))
([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies
([#172](#172))
([c7f1b26](c7f1b26)),
closes [#26](#26)
[#30](#30)
* implement task decomposition and routing engine
([#14](#14))
([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models
([#71](#71))
([81eabf1](81eabf1))
* implement tool permission checking
([#16](#16))
([833c190](833c190))
* implement YAML config loader with Pydantic validation
([#59](#59))
([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation
([#75](#75))
([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout
([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout
([#62](#62))
([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6)
([#189](#189))
([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field
([#118](#118))
([c0bab18](c0bab18)),
closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all
([#137](#137))
([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes
([#64](#64))
([f581749](f581749))
* wire all modules into observability system
([#97](#97))
([f7a0617](f7a0617))


### Bug Fixes

* address Greptile post-merge review findings from PRs
[#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175)
([#176](#176))
([c5ca929](c5ca929))
* address post-merge review feedback from PRs
[#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167)
([#170](#170))
([3bf897a](3bf897a)),
closes [#169](#169)
* enforce strict mypy on test files
([#89](#89))
([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner
([#50](#50),
[#53](#53))
([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements
([#150](#150))
([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience
([#155](#155))
([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes
([#164](#164))
([c02832a](c02832a))
* pre-PR review fixes for post-merge findings
([#183](#183))
([26b3108](26b3108))
* resolve circular imports, bump litellm, fix release tag format
([#286](#286))
([a6659b5](a6659b5))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries
([#117](#117))
([7e5e861](7e5e861))


### Performance

* harden non-inferable principle implementation
([#195](#195))
([02b5f4e](02b5f4e)),
closes [#188](#188)


### Refactoring

* adopt NotBlankStr across all models
([#108](#108))
([#120](#120))
([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models
([#111](#111))
([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and
review fixes
([#182](#182))
([c107bf9](c107bf9))
* harden personality profiles, department validation, and template
rendering ([#158](#158))
([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop
([#124](#124))
([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules
([#136](#136))
([e9cba89](e9cba89))


### Documentation

* add ADR-001 memory layer evaluation and selection
([#178](#178))
([db3026f](db3026f)),
closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC
([#145](#145))
([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation
([#65](#65))
([55c1025](55c1025)),
closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions
([#127](#127))
([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols
([#128](#128))
([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols
([#121](#121))
([6832db6](6832db6))
* finalize 23 design decisions (ADR-002)
([#190](#190))
([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency
review agent
([#114](#114))
([99766ee](99766ee))


### Tests

* add e2e single agent integration tests
([#24](#24))
([#156](#156))
([f566fb4](f566fb4))
* add provider adapter integration tests
([#90](#90))
([40a61f4](40a61f4))


### CI/CD

* add Release Please for automated versioning and GitHub Releases
([#278](#278))
([a488758](a488758))
* bump actions/checkout from 4 to 6
([#95](#95))
([1897247](1897247))
* bump actions/upload-artifact from 4 to 7
([#94](#94))
([27b1517](27b1517))
* bump anchore/scan-action from 6.5.1 to 7.3.2
([#271](#271))
([80a1c15](80a1c15))
* bump docker/build-push-action from 6.19.2 to 7.0.0
([#273](#273))
([dd0219e](dd0219e))
* bump docker/login-action from 3.7.0 to 4.0.0
([#272](#272))
([33d6238](33d6238))
* bump docker/metadata-action from 5.10.0 to 6.0.0
([#270](#270))
([baee04e](baee04e))
* bump docker/setup-buildx-action from 3.12.0 to 4.0.0
([#274](#274))
([5fc06f7](5fc06f7))
* bump sigstore/cosign-installer from 3.9.1 to 4.1.0
([#275](#275))
([29dd16c](29dd16c))
* harden CI/CD pipeline
([#92](#92))
([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers
([#277](#277))
([aba48af](aba48af))


### Maintenance

* add /worktree skill for parallel worktree management
([#171](#171))
([951e337](951e337))
* add design spec context loading to research-link skill
([8ef9685](8ef9685))
* add post-merge-cleanup skill
([#70](#70))
([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md
([#103](#103))
([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md
([#101](#101))
([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1
([#191](#191))
([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group
([#96](#96))
([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5
([a49ee46](a49ee46))
* fix M0 audit items
([#66](#66))
([c7724b5](c7724b5))
* **main:** release ai-company 0.1.1
([#282](#282))
([2f4703d](2f4703d))
* pin setup-uv action to full SHA
([#281](#281))
([4448002](4448002))
* post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests,
hookify rules
([#148](#148))
([c57a6a9](c57a6a9))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Signed-off-by: Aurelio <19254254+Aureliolo@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Implement user-configurable cost tiers and subscription/quota-aware tracking
