feat: add configurable cost tiers and subscription/quota-aware tracking (#67) by Aureliolo · Pull Request #185 · Aureliolo/synthorg

Aureliolo · 2026-03-09T11:44:10Z

Summary

Configurable cost tiers (budget/cost_tiers.py): CostTierDefinition model with price ranges, display properties, and classify_model_tier() function. Built-in 4-tier defaults (low/medium/high/premium) with user override/merge via resolve_tiers()
Quota & subscription models (budget/quota.py): SubscriptionConfig, QuotaLimit, QuotaWindow, ProviderCostModel, DegradationConfig, QuotaSnapshot, QuotaCheckResult — frozen Pydantic models for provider subscription/quota configuration
Quota tracking service (budget/quota_tracker.py): QuotaTracker with per-provider, per-window (minute/hour/day/month) request and token tracking, automatic window rotation, async-safe via asyncio.Lock
BudgetEnforcer integration: Pre-flight check_can_execute() now includes provider quota checks; check_quota() public API; QuotaExhaustedError raised on exhaustion
Config schema: ProviderConfig gains subscription + degradation fields; RootConfig gains cost_tiers field
Documentation: Updated DESIGN_SPEC.md (§1.4, §9.2, §10.4, §15.3, §15.5), CLAUDE.md, README.md
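
As a rough illustration of the tier merge and classification described above, here is a minimal sketch. `TierSketch`, `resolve_tiers_sketch`, `classify_sketch`, and the price boundaries are all hypothetical stand-ins, not the PR's `CostTierDefinition`, `resolve_tiers()`, or `classify_model_tier()`, and returning `None` on a miss is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierSketch:
    """Hypothetical stand-in for CostTierDefinition."""
    id: str
    min_cost_per_1k: float            # inclusive lower bound
    max_cost_per_1k: Optional[float]  # exclusive upper bound; None = unbounded


# Illustrative boundaries only; the PR's real built-in values are not shown here.
BUILTIN = (
    TierSketch("low", 0.0, 0.002),
    TierSketch("medium", 0.002, 0.01),
    TierSketch("high", 0.01, 0.03),
    TierSketch("premium", 0.03, None),
)


def resolve_tiers_sketch(user_tiers=()):
    """Merge user tiers over the built-ins; a matching id replaces the built-in."""
    merged = {tier.id: tier for tier in BUILTIN}
    merged.update({tier.id: tier for tier in user_tiers})
    return tuple(sorted(merged.values(), key=lambda tier: tier.min_cost_per_1k))


def classify_sketch(cost_per_1k, tiers):
    """Return the id of the first tier whose range contains the cost, else None."""
    if cost_per_1k < 0:
        raise ValueError("cost_per_1k must be non-negative")
    for tier in tiers:
        upper = tier.max_cost_per_1k
        if cost_per_1k >= tier.min_cost_per_1k and (upper is None or cost_per_1k < upper):
            return tier.id
    return None  # assumed "classification miss" outcome
```

Overriding `low` and `medium` by id changes where a given price lands while the untouched `high` and `premium` tiers keep their built-in ranges.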

Closes #67

Test plan

14 new tests added for review findings (exhaustion reason with estimated tokens, deep copy isolation, multi-window exhaustion, input validation, cross-field validation, graceful degradation)
All 4593 tests pass (6 skipped — symlinks/real LLM)
96.35% coverage (80% minimum)
mypy strict: 0 errors
ruff: all checks passed
Pre-commit hooks: all passed

Review coverage

Pre-reviewed by 9 agents (code-reviewer, python-reviewer, pr-test-analyzer, silent-failure-hunter, comment-analyzer, type-design-analyzer, logging-audit, resilience-audit, docs-consistency). 34 findings addressed, 0 skipped.

…ng (#67)

Add two sub-systems for BudgetEnforcer:
- Cost tier definitions with configurable metadata, price ranges, and model classification (CostTierDefinition, resolve_tiers, classify_model_tier)
- Subscription/quota tracking with per-provider rate/token/request caps, three cost models (per_token, subscription, local), and graceful degradation strategies (QuotaTracker, SubscriptionConfig, DegradationConfig)

Integrate quota checks into BudgetEnforcer.check_can_execute() and add cost_tiers to RootConfig, subscription/degradation to ProviderConfig.

Pre-reviewed by 9 agents, 34 findings addressed:
- Fix _build_exhaustion_reason ignoring estimated_tokens (bug)
- Re-export QuotaExhaustedError from engine/__init__.py
- Fix misleading docstrings about degradation strategy
- Fix _patch_periods() creating duplicate context managers
- Add input validation for negative requests/tokens
- Add DEBUG logging for unknown providers in QuotaTracker
- Compute window_resets_at in QuotaSnapshot
- Add QuotaCheckResult cross-field validation
- Add negative cost guard in classify_model_tier
- Update DESIGN_SPEC.md, CLAUDE.md, README.md docs
- Add 14 new tests for coverage gaps
- Code simplification via single-pass validators

github-actions · 2026-03-09T11:44:19Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

coderabbitai · 2026-03-09T11:44:24Z

Caution

Review failed

The pull request is closed.

📥 Commits

Reviewing files that changed from the base of the PR and between 02baced and 05fbbc8.

📒 Files selected for processing (10)

DESIGN_SPEC.md
src/ai_company/budget/cost_tiers.py
src/ai_company/budget/enforcer.py
src/ai_company/budget/quota.py
src/ai_company/budget/quota_tracker.py
src/ai_company/config/schema.py
tests/unit/budget/test_cost_tiers.py
tests/unit/budget/test_enforcer_quota.py
tests/unit/budget/test_quota.py
tests/unit/budget/test_quota_tracker.py

📝 Walkthrough

Summary by CodeRabbit

New Features
- Configurable cost tiers and model-tier classification.
- Per-provider quota tracking across minute/hour/day/month windows with pre-flight checks.
- Provider subscription and degradation strategies (fallback, queue, alert) added to configuration.
- Budget enforcement now includes quota checks and a new quota-exhausted error.
Observability
- New quota- and budget-related event identifiers for monitoring.
Tests
- Extensive unit tests added for cost tiers, quota models, quota tracker, and enforcer integration.

Walkthrough

Adds configurable cost tiers, provider subscription/quota models, a windowed QuotaTracker service, quota-aware BudgetEnforcer integration (with QuotaExhaustedError), observability events for quota/budget, config/schema extensions, default config key, and comprehensive unit tests for the new budgeting surface.

Changes

Cohort / File(s)	Summary
Cost Tier System `src/ai_company/budget/cost_tiers.py`, `src/ai_company/observability/events/budget.py`	New CostTierDefinition and CostTiersConfig models, BUILTIN_TIERS, resolve/classify functions, and two budget event constants for tier resolution/classify misses.
Quota Domain Models `src/ai_company/budget/quota.py`	New quota/subscription models and enums: `QuotaWindow`, `QuotaLimit`, `ProviderCostModel`, `SubscriptionConfig`, `DegradationConfig`/`DegradationAction`, `QuotaSnapshot`, `QuotaCheckResult`, plus utilities `window_start` and `effective_cost_per_1k`.
Quota Tracking Service `src/ai_company/budget/quota_tracker.py`	New `QuotaTracker` implementing windowed per-provider usage, `record_usage`, `check_quota`, snapshot APIs, window rotation, concurrency protection, and quota-related observability events.
Budget Enforcer Integration `src/ai_company/budget/enforcer.py`, `src/ai_company/engine/errors.py`	`BudgetEnforcer` accepts optional `quota_tracker`, `check_can_execute` accepts `provider_name`/`estimated_tokens`, adds `check_quota` and provider quota check path; introduces `QuotaExhaustedError`.
Public API Exports `src/ai_company/budget/__init__.py`, `src/ai_company/engine/__init__.py`	Re-exports new cost/ quota types and `QuotaTracker`; exposes `QuotaExhaustedError` on engine public API.
Configuration & Defaults `src/ai_company/config/schema.py`, `src/ai_company/config/defaults.py`	Adds `ProviderConfig.subscription` and `.degradation` fields; adds `RootConfig.cost_tiers`; default config gains `cost_tiers` key.
Observability Events `src/ai_company/observability/events/quota.py`	New quota event constants (`QUOTA_TRACKER_CREATED`, `QUOTA_USAGE_RECORDED`, `QUOTA_CHECK_ALLOWED`, `QUOTA_CHECK_DENIED`, `QUOTA_WINDOW_ROTATED`, `QUOTA_SNAPSHOT_QUERIED`, `QUOTA_USAGE_SKIPPED`).
Documentation / Design `CLAUDE.md`, `DESIGN_SPEC.md`, `README.md`	Updated docs/design snapshots and README to reflect configurable cost tiers and quota/subscription tracking in budget enforcement.
Tests & Factories `tests/unit/budget/conftest.py`, `tests/unit/config/conftest.py`, `tests/unit/observability/test_events.py`, `tests/unit/budget/test_cost_tiers.py`, `tests/unit/budget/test_quota.py`, `tests/unit/budget/test_quota_tracker.py`, `tests/unit/budget/test_enforcer_quota.py`	Adds factories for new models, `make_quota_tracker` helper, extends config factories, updates event discovery test, and adds extensive unit tests covering tiers, quota models, tracker behavior, and BudgetEnforcer quota integration.

Sequence Diagram(s)

sequenceDiagram
    participant Agent
    participant BudgetEnforcer as BudgetEnforcer
    participant QuotaTracker
    participant Observability as Observability

    Agent->>BudgetEnforcer: check_can_execute(agent_id, provider_name, estimated_tokens)
    alt quota_tracker configured & provider_name provided
        BudgetEnforcer->>QuotaTracker: check_quota(provider_name, estimated_tokens)
        QuotaTracker->>QuotaTracker: evaluate windows (per_minute/hour/day/month)
        alt allowed
            QuotaTracker->>Observability: emit QUOTA_CHECK_ALLOWED
            QuotaTracker-->>BudgetEnforcer: QuotaCheckResult(allowed=true)
            BudgetEnforcer-->>Agent: proceed
        else denied
            QuotaTracker->>Observability: emit QUOTA_CHECK_DENIED
            QuotaTracker-->>BudgetEnforcer: QuotaCheckResult(allowed=false, reason)
            BudgetEnforcer->>Observability: log QUOTA_CHECK_DENIED
            BudgetEnforcer-->>Agent: raise QuotaExhaustedError
        end
    else no quota_tracker or no provider_name
        BudgetEnforcer->>Observability: debug skip quota check
        BudgetEnforcer-->>Agent: proceed
    end
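
The flow in the diagram above can be sketched roughly as follows. Everything here is a simplified stand-in: `EnforcerSketch`, `StubTracker`, and `CheckResult` are hypothetical names, the budget checks that precede the quota check are omitted, and logging is left out.

```python
import asyncio
from dataclasses import dataclass


class BudgetExhaustedError(Exception):
    """Stand-in for the engine's existing budget error."""


class QuotaExhaustedError(BudgetExhaustedError):
    """Raised when a provider quota check is denied."""


@dataclass(frozen=True)
class CheckResult:
    """Hypothetical stand-in for QuotaCheckResult."""
    allowed: bool
    reason: str = ""


class StubTracker:
    """Denies a fixed set of providers; stands in for QuotaTracker."""

    def __init__(self, denied):
        self._denied = set(denied)

    async def check_quota(self, provider_name, estimated_tokens=0):
        if provider_name in self._denied:
            return CheckResult(False, f"{provider_name}: quota exhausted")
        return CheckResult(True)


class EnforcerSketch:
    def __init__(self, quota_tracker=None):
        self._quota_tracker = quota_tracker

    async def check_can_execute(self, agent_id, provider_name=None, estimated_tokens=0):
        # Monthly and daily budget checks would run before this point.
        if self._quota_tracker is None or provider_name is None:
            return  # quota check skipped, execution proceeds
        result = await self._quota_tracker.check_quota(provider_name, estimated_tokens)
        if not result.allowed:
            raise QuotaExhaustedError(result.reason)
```

Because `QuotaExhaustedError` subclasses `BudgetExhaustedError` in this sketch, a caller that already handles budget exhaustion also catches quota denials without a new `except` branch.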

sequenceDiagram
    participant Engine
    participant QuotaTracker
    participant Provider
    participant Monitor

    Engine->>QuotaTracker: check_quota(provider, estimated_tokens)
    QuotaTracker->>QuotaTracker: get current snapshots
    alt within quota
        QuotaTracker-->>Engine: allowed=true
        Engine->>Provider: perform request
        Provider-->>Engine: response
        Engine->>QuotaTracker: record_usage(provider, requests=1, tokens=N)
        QuotaTracker->>Observability: emit QUOTA_USAGE_RECORDED
    else exhausted
        QuotaTracker-->>Engine: allowed=false (reason)
        Engine->>Observability: log quota denial
    end

    Monitor->>QuotaTracker: time boundary crossed
    QuotaTracker->>QuotaTracker: rotate window counters
    QuotaTracker->>Observability: emit QUOTA_WINDOW_ROTATED
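
A minimal sketch of the check, record, and rotate cycle shown above, reduced to a single per-minute request window. `MinuteTrackerSketch` and its injectable `clock` are assumptions made for illustration; the PR's `QuotaTracker` covers four windows plus token counts, and its internals are not shown in this thread.

```python
import asyncio
from datetime import datetime, timezone


def minute_start(now):
    """Truncate a datetime to the start of its minute."""
    return now.replace(second=0, microsecond=0)


class MinuteTrackerSketch:
    """Tracks requests per provider in one per-minute window."""

    def __init__(self, max_requests, clock=lambda: datetime.now(timezone.utc)):
        self._max = max_requests
        self._clock = clock              # injectable so tests can control time
        self._lock = asyncio.Lock()      # serializes counter reads and writes
        self._usage = {}                 # provider -> (window_start, requests_used)

    def _current(self, provider):
        start = minute_start(self._clock())
        window, used = self._usage.get(provider, (start, 0))
        if window != start:
            used = 0                     # boundary crossed: counters rotate to zero
        return start, used

    async def check_quota(self, provider):
        async with self._lock:
            _, used = self._current(provider)
            return used < self._max

    async def record_usage(self, provider, requests=1):
        if requests < 0:
            raise ValueError("requests must be non-negative")
        async with self._lock:
            start, used = self._current(provider)
            self._usage[provider] = (start, used + requests)
```

The lock protects each call on its own; a check followed by a separate record is still two steps, so another task can slip in between them.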

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~70 minutes

Possibly related PRs

feat: add configurable cost tiers and subscription/quota-aware tracking (#67) #185: Appears to add the same budget cost_tiers/quota/quota_tracker modules and tests — likely the exact or strongly overlapping implementation.
refactor: harden BudgetEnforcer with error handling and review fixes #182: Directly related changes to BudgetEnforcer and the budget subsystem; touches enforcement integration and related patterns.
feat: implement YAML config loader with Pydantic validation #75: Related to configuration/schema additions and the YAML/Pydantic config surface that this PR extends (ProviderConfig/RootConfig changes).

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately and concisely summarizes the main change: adding configurable cost tiers and subscription/quota-aware tracking.
Description check	✅ Passed	The description clearly explains the changeset with specific modules, features, and integration points, directly relating to the code changes.
Linked Issues check	✅ Passed	The PR meets all key acceptance criteria from issue `#67`: CostTierDefinition with display properties, built-in 4-tier defaults with user override, SubscriptionConfig and quota models, QuotaTracker with multi-window tracking, BudgetEnforcer integration, and config schema updates.
Out of Scope Changes check	✅ Passed	All code changes directly support the stated objectives: cost tier definitions, subscription/quota models, quota tracking service, BudgetEnforcer integration, config updates, and comprehensive tests.
Docstring Coverage	✅ Passed	Docstring coverage is 96.02% which is sufficient. The required threshold is 80.00%.

gemini-code-assist · 2026-03-09T11:44:39Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the system's ability to manage and enforce costs and resource usage for AI model providers. It introduces a flexible framework for defining and classifying models into configurable cost tiers, alongside a robust quota and subscription tracking mechanism. By integrating these features into the existing budget enforcement, the system can now perform granular pre-flight checks, prevent overages, and lay the groundwork for graceful degradation strategies, thereby improving cost control and operational stability.

Highlights

Configurable Cost Tiers: Introduced CostTierDefinition and CostTiersConfig models, allowing for flexible definition of cost tiers based on price ranges, display properties, and classification functions. Built-in default tiers (low/medium/high/premium) are provided, with support for user overrides and merging.
Quota and Subscription Models: Added new Pydantic models (SubscriptionConfig, QuotaLimit, QuotaWindow, ProviderCostModel, DegradationConfig, QuotaSnapshot, QuotaCheckResult) to define provider subscription plans, usage limits across various time windows, and strategies for graceful degradation when quotas are approached.
Quota Tracking Service: Implemented a QuotaTracker service responsible for monitoring per-provider, per-window request and token usage. This service handles automatic window rotation and ensures concurrency safety using asyncio.Lock.
BudgetEnforcer Integration: The existing BudgetEnforcer was enhanced to incorporate provider quota checks into its pre-flight check_can_execute() method. A new public check_quota() API was added, and a QuotaExhaustedError is now raised when a provider's quota is depleted.
Configuration Schema Updates: The main configuration schema (RootConfig) now includes a cost_tiers field for defining custom cost tiers. Additionally, ProviderConfig now supports subscription and degradation fields to configure provider-specific quota limits and degradation behaviors.
Documentation Updates: Key documentation files (DESIGN_SPEC.md, CLAUDE.md, README.md) were updated to reflect the new cost tier and quota tracking functionalities, providing clearer guidance on their implementation and usage.
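
To illustrate what a frozen result model with cross-field validation might look like, here is a sketch using a standard-library dataclass instead of Pydantic. `CheckResultSketch` is hypothetical, and the specific rule (a denied result must carry a reason, an allowed one must not) is an assumption; the thread does not state which fields the PR's `QuotaCheckResult` validator actually compares.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class CheckResultSketch:
    """Hypothetical immutable quota-check result."""
    allowed: bool
    reason: str = ""
    exhausted_windows: tuple = ()

    def __post_init__(self):
        # Assumed cross-field rule: the fields must agree with `allowed`.
        if self.allowed and (self.reason or self.exhausted_windows):
            raise ValueError("an allowed result must not carry denial details")
        if not self.allowed and not self.reason:
            raise ValueError("a denied result needs a reason")
```

Freezing the model means a caller cannot flip `allowed` after the check, which keeps the result safe to pass between tasks.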

Changelog

CLAUDE.md
- Updated the description of the budget/ directory to include cost tiers and quota/subscription tracking.
DESIGN_SPEC.md
- Updated the M5 implementation snapshot to reflect the completion of budget enforcement, including configurable cost tiers and quota/subscription tracking.
- Added an implementation note for M5 detailing the new subscription and degradation fields in ProviderConfig.
- Updated the BudgetEnforcer implementation note to include QuotaTracker and ModelResolver in its composition and to mention provider quota enforcement in pre-flight checks.
- Added a new entry to the conventions table for "Cost tiers & quota tracking" with details on their implementation and benefits.
README.md
- Updated the "Budget Enforcement (M5)" entry to explicitly mention configurable cost tiers and quota/subscription tracking.
src/ai_company/budget/__init__.py
- Added imports for CostTierDefinition, CostTiersConfig, classify_model_tier, resolve_tiers, QuotaLimit, QuotaWindow, SubscriptionConfig, DegradationAction, DegradationConfig, ProviderCostModel, QuotaSnapshot, QuotaCheckResult, QuotaTracker, and effective_cost_per_1k.
- Exported new cost tier and quota-related symbols in __all__.
src/ai_company/budget/cost_tiers.py
- Added a new file defining CostTierDefinition (metadata for a single tier), CostTiersConfig (configuration for tiers), BUILTIN_TIERS (default tier definitions), resolve_tiers (merges built-in and user-defined tiers), and classify_model_tier (classifies model cost into a tier).
src/ai_company/budget/enforcer.py
- Imported QuotaCheckResult and QuotaExhaustedError.
- Imported new observability events for quota checks.
- Modified the BudgetEnforcer constructor to accept an optional QuotaTracker instance.
- Updated check_can_execute to include an optional provider_name argument and perform provider quota checks if a QuotaTracker is configured.
- Added a new public asynchronous method check_quota to delegate quota checks to the QuotaTracker.
- Added a private asynchronous method _check_provider_quota to handle raising QuotaExhaustedError.
- Added a helper function _always_allowed_result for QuotaCheckResult.
src/ai_company/budget/quota.py
- Added a new file defining QuotaWindow (time window enum), QuotaLimit (single quota limit), ProviderCostModel (how a provider charges), SubscriptionConfig (subscription and quota configuration), DegradationAction (action on quota exhaustion), DegradationConfig (degradation strategy configuration), QuotaSnapshot (point-in-time usage snapshot), and QuotaCheckResult (result of a quota check).
- Included utility functions window_start (computes UTC-aware start of a window) and effective_cost_per_1k (computes cost based on provider model).
src/ai_company/budget/quota_tracker.py
- Added a new file implementing the QuotaTracker service, which tracks per-provider request and token usage across configured time windows.
- Implemented record_usage to update counters and handle window rotation.
- Implemented check_quota for pre-flight checks against limits.
- Implemented get_snapshot and get_all_snapshots to retrieve current usage data.
- Included internal helper functions for exhaustion checks and reason building.
src/ai_company/config/defaults.py
- Added cost_tiers to the default configuration dictionary.
src/ai_company/config/schema.py
- Imported CostTiersConfig, DegradationConfig, and SubscriptionConfig.
- Updated ProviderConfig to include subscription and degradation fields.
- Updated RootConfig to include a cost_tiers field.
src/ai_company/engine/__init__.py
- Exported QuotaExhaustedError.
src/ai_company/engine/errors.py
- Defined QuotaExhaustedError as a new exception class, subclassing BudgetExhaustedError.
src/ai_company/observability/events/budget.py
- Added new event constants BUDGET_TIER_RESOLVED and BUDGET_TIER_CLASSIFY_MISS.
src/ai_company/observability/events/quota.py
- Added a new file defining event constants for quota tracking, such as QUOTA_TRACKER_CREATED, QUOTA_USAGE_RECORDED, QUOTA_CHECK_ALLOWED, QUOTA_CHECK_DENIED, QUOTA_WINDOW_ROTATED, QUOTA_SNAPSHOT_QUERIED, and QUOTA_USAGE_SKIPPED.
tests/unit/budget/conftest.py
- Added CostTierDefinitionFactory, CostTiersConfigFactory, QuotaLimitFactory, and SubscriptionConfigFactory for testing.
- Added a make_quota_tracker helper function for creating QuotaTracker instances in tests.
tests/unit/budget/test_cost_tiers.py
- Added a new file with unit tests for CostTierDefinition model validation, CostTiersConfig behavior, BUILTIN_TIERS properties, resolve_tiers functionality (merging and overrides), and classify_model_tier logic (classification boundaries).
tests/unit/budget/test_enforcer_quota.py
- Added a new file with unit tests for BudgetEnforcer's integration with QuotaTracker, covering scenarios where quota is allowed, exhausted, skipped, and handling of unexpected errors.
tests/unit/budget/test_quota.py
- Added a new file with comprehensive unit tests for all new quota and subscription domain models: QuotaWindow, QuotaLimit, ProviderCostModel, SubscriptionConfig, DegradationAction, DegradationConfig, QuotaSnapshot, and QuotaCheckResult.
- Tested utility functions window_start and effective_cost_per_1k.
- Included tests for validation rules, computed properties, and edge cases like negative inputs or over-limit usage.
tests/unit/budget/test_quota_tracker.py
- Added a new file with unit tests for the QuotaTracker service, covering its construction, record_usage (including accumulation and window rotation), check_quota (including estimated tokens and multiple windows), get_snapshot, and get_all_snapshots.
- Included tests for deep copy isolation and input validation.
tests/unit/config/conftest.py
- Updated ProviderConfigFactory to include default SubscriptionConfig and DegradationConfig.
- Updated RootConfigFactory to include default CostTiersConfig.
tests/unit/observability/test_events.py
- Added 'quota' to the list of domain modules checked for event constant discovery.
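
The changelog mentions a `window_start` utility that computes the UTC-aware start of a window. One way such a helper could work is sketched below; `Window` and `window_start_sketch` are hypothetical names, and the enum values are guessed from the `per_minute/hour/day/month` wording used elsewhere in this thread.

```python
from datetime import datetime, timezone
from enum import Enum


class Window(str, Enum):
    """Hypothetical stand-in for QuotaWindow."""
    PER_MINUTE = "per_minute"
    PER_HOUR = "per_hour"
    PER_DAY = "per_day"
    PER_MONTH = "per_month"


def window_start_sketch(window, now=None):
    """Return the UTC start of the window that contains `now`."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    if window is Window.PER_MINUTE:
        return now.replace(second=0, microsecond=0)
    if window is Window.PER_HOUR:
        return now.replace(minute=0, second=0, microsecond=0)
    if window is Window.PER_DAY:
        return now.replace(hour=0, minute=0, second=0, microsecond=0)
    # Remaining case: calendar month, starting on day 1 at midnight UTC.
    return now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
```

Comparing the stored window start with a freshly computed one is enough to detect a boundary crossing, which is how a tracker can rotate counters lazily on the next call rather than on a timer.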

Activity

14 new tests were added to cover review findings, including exhaustion reasons with estimated tokens, deep copy isolation, multi-window exhaustion, input validation, cross-field validation, and graceful degradation.
All 4593 existing tests passed, with 6 tests skipped (related to symlinks/real LLM interactions).
Code coverage stands at 96.35%, exceeding the 80% minimum requirement.
Mypy strict type checking reported 0 errors.
Ruff linter checks passed without issues.
All pre-commit hooks passed successfully.
The pull request underwent pre-review by 9 automated agents (code-reviewer, python-reviewer, pr-test-analyzer, silent-failure-hunter, comment-analyzer, type-design-analyzer, logging-audit, resilience-audit, docs-consistency), leading to 34 addressed findings and 0 skipped.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

greptile-apps · 2026-03-09T11:50:07Z

Greptile Summary

This PR completes the M5 budget enforcement layer by adding three tightly-integrated components: configurable cost tiers (cost_tiers.py), a comprehensive quota/subscription model layer (quota.py), and an async-safe per-provider quota tracking service (quota_tracker.py). BudgetEnforcer is extended with an optional QuotaTracker injection point, and QuotaExhaustedError (a BudgetExhaustedError subclass) is wired into the existing pre-flight exception hierarchy. The implementation is well-tested (96% coverage, mypy strict, ruff clean) and unimplemented FALLBACK/QUEUE strategies are correctly documented in both code and the design spec.

Key findings:

Logic — snapshot / enforcement token-boundary divergence (quota.py:278): QuotaSnapshot.is_exhausted uses >= for tokens (at-limit counts as exhausted) while _is_window_exhausted uses strict > (exact-fill is permitted). When all tokens are consumed and a check is made with zero estimated projection, the snapshot signals exhaustion but check_quota returns allowed. Since check_can_execute defaults the projection to zero, this discrepancy is always present in the common code path.
Style — QUOTA_WINDOW_ROTATED logged at DEBUG (quota_tracker.py:162): CLAUDE.md requires all state transitions to log at INFO. A window rotation resets quota counters and directly affects subsequent enforcement decisions; it warrants INFO-level visibility.
Style — effective_cost_per_1k accepts negative cost components silently (quota.py:377): No validation or warning is emitted when either cost argument is negative, producing a semantically invalid result that classifies silently without surfacing the anomaly.
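
The first finding is easiest to see with the two comparisons side by side. The functions below are simplified reconstructions from the description above, not the PR's code; the real `is_exhausted` property and `_is_window_exhausted` helper take different arguments.

```python
def snapshot_is_exhausted(tokens_used, max_tokens):
    # Snapshot view as described: reaching the limit counts as exhausted.
    return tokens_used >= max_tokens


def enforcement_is_exhausted(tokens_used, max_tokens, estimated_tokens=0):
    # Enforcement view as described: filling the limit exactly is permitted.
    return tokens_used + estimated_tokens > max_tokens


used, limit = 1000, 1000
print(snapshot_is_exhausted(used, limit))        # True
print(enforcement_is_exhausted(used, limit))     # False
print(enforcement_is_exhausted(used, limit, 1))  # True
```

With every token consumed and no projection, the snapshot reports exhaustion while enforcement still allows the call; the two agree again as soon as the estimate is at least one token.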

Confidence Score: 3/5

Mostly safe to merge; one behavioral inconsistency between the snapshot API and enforcement predicate warrants a fix before shipping the snapshot as a public observability surface.
The PR is well-engineered overall — strong test coverage, mypy strict, correct exception hierarchy, acknowledged TOCTOU, and documented unimplemented strategies. The score is held at 3 due to the logic inconsistency between QuotaSnapshot.is_exhausted and _is_window_exhausted for the token-at-limit boundary: the two APIs give opposite answers for the same state, which will mislead any operator or downstream consumer that relies on the snapshot to predict enforcement behavior. The two style issues (log level on window rotation, no negative-cost guard in effective_cost_per_1k) are low-risk but worth addressing per project conventions.
Pay close attention to src/ai_company/budget/quota.py — specifically the QuotaSnapshot.is_exhausted property (line 278) and the effective_cost_per_1k helper (line 377).

Important Files Changed

Filename	Overview
src/ai_company/budget/quota.py	New file: quota and subscription domain models. `QuotaSnapshot.is_exhausted` uses `>=` for token exhaustion, but the enforcement predicate `_is_window_exhausted` uses `>` (exact-fill permitted), causing a behavioral divergence between the snapshot API and actual enforcement. `effective_cost_per_1k` also accepts negative cost components without validation or logging.
src/ai_company/budget/quota_tracker.py	New file: async-safe per-provider quota tracking with window rotation. Logic is correct and well-tested. `QUOTA_WINDOW_ROTATED` is emitted at DEBUG level; per CLAUDE.md, state transitions should be logged at INFO. TOCTOU gap between check and record is acknowledged in the class docstring.
src/ai_company/budget/enforcer.py	Extended with optional `QuotaTracker` injection and `check_quota` / `_check_provider_quota` methods. `QuotaExhaustedError` (a `BudgetExhaustedError` subclass) is correctly re-raised by the `except BudgetExhaustedError: raise` guard. The public `check_quota` method correctly delegates and documents its different error-handling contract vs `check_can_execute`.
src/ai_company/budget/cost_tiers.py	New file: configurable cost tier definitions with merge/override semantics. Model validators, boundary classification, and logging are all well-structured. No issues found.
src/ai_company/config/schema.py	Adds `subscription`/`degradation` fields to `ProviderConfig` and `cost_tiers` to `RootConfig`. Cross-field validator correctly checks that `fallback_providers` reference known providers. Clean integration.
src/ai_company/engine/errors.py	Adds `QuotaExhaustedError` as a subclass of `BudgetExhaustedError`. Docstring correctly notes that FALLBACK/QUEUE degradation routing is unimplemented. Correct hierarchy placement.
tests/unit/budget/test_quota_tracker.py	Comprehensive tests covering construction, record_usage, check_quota, snapshots, window rotation (with time mocking), deep-copy isolation, multi-window exhaustion, and projected-token denial. No issues found.
tests/unit/budget/test_enforcer_quota.py	Tests for BudgetEnforcer quota integration including graceful degradation on unexpected exceptions. Well-structured with clear helpers and deterministic time patching. No issues found.
src/ai_company/observability/events/quota.py	New file: quota event constants. All 7 event constants are well-named and follow established conventions.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant BudgetEnforcer
    participant QuotaTracker
    participant QuotaSnapshot

    Caller->>BudgetEnforcer: check_can_execute(agent_id, provider_name, estimated_tokens)
    BudgetEnforcer->>BudgetEnforcer: _check_monthly_hard_stop()
    BudgetEnforcer->>BudgetEnforcer: _check_daily_limit()
    BudgetEnforcer->>BudgetEnforcer: _check_provider_quota(agent_id, provider_name, estimated_tokens)
    BudgetEnforcer->>QuotaTracker: check_quota(provider_name, estimated_tokens)
    QuotaTracker->>QuotaTracker: _is_window_exhausted(usage, quota, estimated_tokens)
    alt quota exhausted
        QuotaTracker-->>BudgetEnforcer: QuotaCheckResult(allowed=False, reason=..., exhausted_windows=...)
        BudgetEnforcer-->>Caller: raise QuotaExhaustedError
    else quota OK
        QuotaTracker-->>BudgetEnforcer: QuotaCheckResult(allowed=True)
        BudgetEnforcer-->>Caller: return (execution allowed)
    end

    Caller->>QuotaTracker: record_usage(provider_name, requests, tokens)
    QuotaTracker->>QuotaTracker: rotate window if boundary crossed
    QuotaTracker->>QuotaTracker: accumulate counters

    Caller->>QuotaTracker: get_snapshot(provider_name)
    QuotaTracker-->>QuotaSnapshot: build QuotaSnapshot(requests_used, tokens_used, ...)
    QuotaSnapshot-->>Caller: snapshot with is_exhausted, requests_remaining, tokens_remaining

Last reviewed commit: 05fbbc8

src/ai_company/budget/quota.py

src/ai_company/budget/enforcer.py

Copilot

Pull request overview

Adds configurable cost-tier definitions and provider subscription/quota awareness to the budget enforcement layer, introducing new budget-domain models/services and wiring them into config + observability.

Changes:

Introduces configurable cost tiers (CostTierDefinition, CostTiersConfig, resolve_tiers(), classify_model_tier()) and exposes them via config and observability events.
Adds subscription/quota domain models (SubscriptionConfig, QuotaLimit, QuotaSnapshot, etc.) and an async-safe QuotaTracker for per-provider, per-window usage tracking.
Integrates quota checks into BudgetEnforcer pre-flight execution checks and adds associated errors/events/docs/tests.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 4 comments.

File	Description
tests/unit/observability/test_events.py	Updates domain-module discovery expectations to include new `quota` events module.
tests/unit/config/conftest.py	Extends config factories to include `cost_tiers`, `subscription`, and `degradation` defaults.
tests/unit/budget/test_quota_tracker.py	Adds unit tests covering `QuotaTracker` behavior (recording, rotation, snapshots, validation).
tests/unit/budget/test_quota.py	Adds unit tests for quota/subscription models and helper functions.
tests/unit/budget/test_enforcer_quota.py	Adds unit tests validating `BudgetEnforcer` quota integration and error behavior.
tests/unit/budget/test_cost_tiers.py	Adds unit tests for cost tier definitions, merging, and classification boundaries.
tests/unit/budget/conftest.py	Adds budget test factories/fixtures for cost tiers and quota tracking.
src/ai_company/observability/events/quota.py	Adds quota-related observability event constants.
src/ai_company/observability/events/budget.py	Adds budget event constants for tier resolution and classification misses.
src/ai_company/engine/errors.py	Introduces `QuotaExhaustedError` for provider quota exhaustion.
src/ai_company/engine/__init__.py	Re-exports `QuotaExhaustedError` from the engine package.
src/ai_company/config/schema.py	Extends provider/root config with `subscription`, `degradation`, and `cost_tiers` fields.
src/ai_company/config/defaults.py	Adds default `cost_tiers` stanza to the default config dict.
src/ai_company/budget/quota_tracker.py	Implements `QuotaTracker` service with window rotation + snapshots + logging.
src/ai_company/budget/quota.py	Adds quota/subscription/degradation models and helpers (`window_start`, `effective_cost_per_1k`).
src/ai_company/budget/enforcer.py	Wires quota checks into `BudgetEnforcer` and adds `check_quota()` API.
src/ai_company/budget/cost_tiers.py	Adds cost tier models, built-in tier set, merge logic, and classifier.
src/ai_company/budget/__init__.py	Re-exports new budget APIs (tiers, quota models, tracker).
README.md	Updates milestone feature list to mention tiers + quota/subscription tracking.
DESIGN_SPEC.md	Documents newly implemented budget/quota/tier capabilities and event/module structure.
CLAUDE.md	Updates repo layout description to include tiers and quota/subscription tracking under budget.

Comments suppressed due to low confidence (1)

src/ai_company/budget/enforcer.py:127

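`except MemoryError, RecursionError:` is a SyntaxError on Python versions before 3.14 and would prevent this module from importing there (PEP 758 permits the unparenthesized form from 3.14 onward, provided there is no `as` clause). For portability, use the tuple form and optionally bind the exception, e.g. `except (MemoryError, RecursionError):` or `except (MemoryError, RecursionError) as exc:`.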

        except BudgetExhaustedError:
            raise
        except MemoryError, RecursionError:  # builtin MemoryError (OOM)
            raise

Copilot · 2026-03-09T11:51:03Z

src/ai_company/budget/quota_tracker.py

+        if provider_name not in self._usage:
+            logger.debug(
+                QUOTA_USAGE_SKIPPED,
+                provider=provider_name,
+                reason="no_subscription_config",
+            )
+            return


QuotaTracker treats providers with a SubscriptionConfig but no quotas the same as unknown providers (provider_name not in self._usage) and logs reason="no_subscription_config". This is misleading in observability; consider distinguishing "unknown provider" from "no quotas configured" (or key _usage off subscriptions and check sub_config.quotas separately).

Copilot · 2026-03-09T11:51:03Z

src/ai_company/budget/cost_tiers.py

+Provides configurable metadata for cost tiers: price ranges, display
+properties, and model-to-tier classification.  The built-in ``CostTier``
+enum (``core.enums``) remains for backward compatibility; this module
+adds a configurable layer on top.


The module docstring references the built-in CostTier enum as (core.enums), but in this repo the enum lives in ai_company.core.enums. Updating the reference avoids pointing readers to a non-existent module path.

Copilot · 2026-03-09T11:51:04Z

tests/unit/budget/test_quota.py

+    def test_defaults_to_now(self) -> None:
+        """Uses current time when now is not provided."""
+        result = window_start(QuotaWindow.PER_DAY)
+        now = datetime.now(UTC)
+        assert result.day == now.day


This test can be flaky around UTC midnight because it calls window_start() and then separately calls datetime.now(UTC) for comparison; the day could roll over between the two calls. Prefer passing a fixed now= into window_start() (or freezing time) and asserting the full expected value.

Copilot · 2026-03-09T11:51:04Z

DESIGN_SPEC.md

 | **Personality compatibility scoring** | Adopted (M3) | Weighted composite: 60% Big Five similarity (openness, conscientiousness, agreeableness, stress_response → 1−\|diff\|; extraversion → tent-function peaking at 0.3 diff), 20% collaboration alignment (ordinal adjacency: INDEPENDENT↔PAIR↔TEAM), 20% conflict approach (constructive pairs score 1.0, destructive pairs 0.2, mixed 0.4–0.6). `itertools.combinations` for team-level averaging. Result clamped to [0, 1]. | Covers behavioral diversity (extraversion complement), task alignment (conscientiousness similarity), and interpersonal friction (conflict approach). Weights are configurable module constants. |
 | **Agent behavior testing** | Planned (M3) | Scripted `FakeProvider` for unit tests (deterministic turn sequences); behavioral outcome assertions for integration tests (task completed, tools called, cost within budget). | Leverages existing `FakeProvider` and `CompletionResponseFactory` fixtures. Precise engine testing without brittle response-matching at integration level. |
 | **LLM call analytics** | Adopted (incremental) | M3: proxy metrics (`turns_per_task`, `tokens_per_task`) — adopted. M4 data models: call categorization (`productive`, `coordination`, `system`), category analytics, coordination metrics, orchestration ratio — adopted. M4 runtime collection pipeline and M5+ full analytics: planned. | Append-only, never blocks execution. Builds on existing `CostRecord` infrastructure. Detects orchestration overhead early. See §10.5. |
+| **Cost tiers & quota tracking** | Adopted (M5) | Configurable `CostTier` definitions with merge/override semantics via `resolve_tiers(defaults, overrides)`. `SubscriptionConfig` + `QuotaLimit` model per-provider subscription plans. `QuotaTracker` enforces per-provider request/token quotas with window-based rotation. `DegradationConfig` controls behavior when quotas are approached. | Enables cost classification without hardcoding vendor tiers. Quota tracking prevents surprise overages at the provider level. Window-based rotation aligns quota resets with billing periods. See §10.4. |


DESIGN_SPEC says cost tiers are merged via resolve_tiers(defaults, overrides), but the implementation added in this PR exposes resolve_tiers(config: CostTiersConfig) instead. Update the spec to match the actual API so readers don’t implement against the wrong signature.

gemini-code-assist

Code Review

This pull request introduces a significant and well-structured set of features for budget management, including configurable cost tiers, provider-level quota tracking, and integration with the existing BudgetEnforcer. The new Pydantic models for quotas and cost tiers are robust, and the QuotaTracker service is well-implemented with attention to concurrency safety. The accompanying tests are comprehensive. I have identified one critical syntax error and one high-severity concern regarding the fail-open behavior of the budget check.

gemini-code-assist · 2026-03-09T11:54:25Z

src/ai_company/budget/enforcer.py

+                await self._check_provider_quota(agent_id, provider_name)
        except BudgetExhaustedError:
            raise
        except MemoryError, RecursionError:  # builtin MemoryError (OOM)


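This except statement omits the parentheses (except MemoryError, RecursionError:), which is a SyntaxError on Python 3 versions before 3.14; in Python 2 the same spelling meant "catch MemoryError and bind it to the name RecursionError". To catch multiple exceptions on every Python 3 version, enclose them in a tuple.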

except (MemoryError, RecursionError): # builtin MemoryError (OOM)
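
For illustration, a small self-contained sketch of the parenthesized form in context. The function `reraise_fatal` is hypothetical and only demonstrates the handler ordering; it is not code from the PR.

```python
def reraise_fatal(exc: BaseException) -> str:
    """Re-raise fatal errors; swallow any other Exception and report its type."""
    try:
        raise exc
    except (MemoryError, RecursionError):  # tuple form parses on every Python 3 release
        raise
    except Exception as err:
        # Non-fatal errors are handled here instead of propagating.
        return type(err).__name__
```

Listing the fatal exceptions first matters: both are subclasses of `Exception`, so a broad handler placed above them would catch them before they could be re-raised.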

coderabbitai

Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/ai_company/budget/enforcer.py (1)
117-134: ⚠️ Potential issue | 🟠 Major

Don't fail open when the quota subsystem itself errors.

This broad fallback now includes _check_provider_quota(). Before quota integration, the allow-on-error path only weakened spend preflight; with quotas inside the same try, a tracker/config bug silently disables provider-cap enforcement altogether.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai_company/budget/enforcer.py` around lines 117 - 134, The current
try/except wraps _check_monthly_hard_stop, _check_daily_limit, and
_check_provider_quota so any exception in _check_provider_quota silently falls
back to allow execution; move _check_provider_quota out of the broad try (or
give it its own tight try that only catches BudgetExhaustedError and
MemoryError/RecursionError) so provider quota subsystem errors are not
swallowed—adjust calls to _check_monthly_hard_stop, _check_daily_limit, and
_check_provider_quota accordingly to ensure only the preflight spend checks may
fall back, while quota errors propagate.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@DESIGN_SPEC.md`:
- Line 1697: The spec overstates runtime behavior for
ProviderConfig.degradation: either narrow DegradationConfig documentation to
reflect current behavior (that QuotaExhaustedError still always raises and
FALLBACK/QUEUE routing is not implemented) and update the QuotaExhaustedError
docstring to explicitly state the limitation, or implement the missing runtime
behaviors (FALLBACK/QUEUE auto-downgrade or rejection) to match the spec; in
either case update DESIGN_SPEC.md and the QuotaExhaustedError docstring
consistently and add an explicit note (per coding guidelines) that alerts users
when the implementation deviates from the spec and why.

In `@src/ai_company/budget/enforcer.py`:
- Around line 94-99: check_can_execute currently performs a zero-token quota
check because it calls _check_provider_quota()/check_quota() without passing
through an estimated token count; thread an estimated_tokens argument from
check_can_execute into any calls to _check_provider_quota and ultimately to the
quota check method (check_quota) so the preflight uses the projected usage, and
update the other call sites referenced (the blocks around the other occurrences
of _check_provider_quota / check_quota noted at the other ranges) to pass the
same estimated_tokens value to maintain consistent token-based caps enforcement.

In `@src/ai_company/budget/quota_tracker.py`:
- Around line 88-157: check_quota() and record_usage() have a TOCTOU gap: under
concurrency multiple coroutines can each pass check_quota() then race to
record_usage(), oversubscribing limits. Fix by introducing an atomic admission
method (e.g. try_consume or reserve_and_consume) that acquires self._lock,
computes remaining capacity using the same window logic (window_start via
window_start(...) and current from self._usage[provider_name]), and if capacity
suffices increments the counters and returns success; update callers to use this
new atomic method instead of calling check_quota() then record_usage(), and
leave check_quota() as a pure read-only inspector (or remove it) so all
admission decisions are made via the single locked path in the new method; also
adjust record_usage() so it’s either private for non-admission bookkeeping or
folded into the atomic consume implementation.
- Around line 345-370: The token-path currently treats a request that exactly
reaches max_tokens as exhausted because both _is_window_exhausted and
_build_exhaustion_reason use "projected >= quota.max_tokens"; change those
checks to use a strict ">" comparison (i.e., projected > quota.max_tokens) so a
request that exactly fills the remaining tokens is allowed, and only requests
that would exceed the token limit are denied; update the comparisons in both
_is_window_exhausted (the return expression) and _build_exhaustion_reason (the
token append condition that uses the local projected variable) to reflect this
strict comparison.
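
A minimal sketch of the two quota_tracker.py suggestions combined: one locked admission path and a strict `>` comparison. `TokenWindow` and `demo` are hypothetical names, and the real `QuotaTracker` tracks several windows and providers, which this sketch leaves out.

```python
import asyncio


class TokenWindow:
    """Hypothetical single-window token counter with an atomic admission path."""

    def __init__(self, max_tokens: int) -> None:
        self._max_tokens = max_tokens
        self._used = 0
        self._lock = asyncio.Lock()

    async def try_consume(self, tokens: int) -> bool:
        """Check and record under one lock so concurrent callers cannot oversubscribe."""
        async with self._lock:
            projected = self._used + tokens
            # Strict ">" so a request that exactly fills the window is admitted.
            if projected > self._max_tokens:
                return False
            self._used = projected
            return True


async def demo() -> list[bool]:
    window = TokenWindow(max_tokens=300)
    # Five concurrent requests of 100 tokens each; only three fit in the window.
    return await asyncio.gather(*(window.try_consume(100) for _ in range(5)))
```

Running `asyncio.run(demo())` admits exactly three of the five requests, including the one that brings usage to exactly 300 tokens.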

In `@src/ai_company/budget/quota.py`:
- Around line 305-347: The window_start function currently reads calendar fields
from the provided now datetime and then tags the result as UTC, which yields
wrong boundaries for non-UTC-aware datetimes; update window_start (and
references to QuotaWindow.PER_MINUTE / PER_HOUR / PER_DAY) to first reject naive
datetimes (raise a ValueError if now.tzinfo is None or now.tzinfo.utcoffset(now)
is None) and then normalize/convert the timestamp to UTC via now =
now.astimezone(UTC) before extracting year/month/day/hour/minute to build the
window-start datetime.

In `@src/ai_company/config/schema.py`:
- Around line 202-209: RootConfig currently accepts
degradation.fallback_providers (a tuple of strings) without verifying they
reference valid provider names, so typos pass config load; update RootConfig
validation to cross-reference each entry in degradation.fallback_providers
against the canonical provider registry/list (the same source used for routing
model refs) and raise a validation error for any unknown provider names.
Implement this as a pydantic validator/root_validator inside RootConfig (or a
dedicated validate_fallback_providers method) that iterates over
RootConfig.degradation.fallback_providers, checks membership in the provider
registry, and produces a clear ValidationError listing invalid names so config
load fails fast.
- Around line 206-209: The default_factory currently builds a DegradationConfig
(which defaults to FALLBACK and triggers a validator warning when
fallback_providers is empty), causing CONFIG_VALIDATION_FAILED noise; fix by
making the default a no-op degradation config instead of the FALLBACK
default—either change the Field to use default=None with
Optional[DegradationConfig] or set default_factory to create an explicit
non-degrading config (e.g. default_factory=lambda:
DegradationConfig(mode=DegradationMode.NONE)); update the degradation Field
declaration and imports to reference DegradationConfig and DegradationMode
accordingly so normal ProviderConfig parsing does not emit warnings when the
user did not opt into degradation.

In `@src/ai_company/engine/errors.py`:
- Around line 41-46: The current QuotaExhaustedError class and call sites make
every quota miss terminal; update the logic that raises QuotaExhaustedError to
first inspect DegradationConfig.action (the degradation routing policy) and only
raise a terminal QuotaExhaustedError when the configured action mandates
termination; for FALLBACK or QUEUE actions, either return a non-terminal
signal/result or raise a different non-terminal exception type so the caller can
perform fallback/queue behavior; ensure references to QuotaExhaustedError and
BudgetExhaustedError remain consistent and update any docs/comments to reflect
that raising QuotaExhaustedError now depends on DegradationConfig.action.

In `@tests/unit/budget/test_cost_tiers.py`:
- Around line 64-72: The test allows zero-width tiers but the classifier uses
the half-open rule (min <= cost < max) so price_range_min == price_range_max
will never match; fix by either rejecting equal bounds during CostTierDefinition
validation (add a check in CostTierDefinition.__post_init__ or the existing
validate_tiers function to raise for price_range_min >= price_range_max) or
change the classifier comparison in the function that does the tier matching
(where it currently uses "min <= cost < max") to make the upper bound inclusive
for the final/only tier (use "min <= cost <= max" for that case or otherwise
ensure a deterministic inclusive rule). Ensure you reference CostTierDefinition
and the classifier function/method in cost_tiers.py when applying the change.
- Around line 302-349: Collapse the duplicated boundary tests for
classify_model_tier into a single `@pytest.mark.parametrize` table (reusing the
same parameterization you added around lines 368-400), replacing the separate
functions test_boundary_low_medium, test_boundary_medium_high,
test_boundary_high_premium and the within-range tests with parametrized cases
that include input cost and expected tier; remove the duplicated standalone
tests (the ones at lines 302-349) and ensure the parameter table also adds a
missing negative-cost regression case (e.g., cost = -0.001 -> expected "low") so
all boundaries and the negative-cost scenario are covered by the single
parametrized test for classify_model_tier.

In `@tests/unit/budget/test_enforcer_quota.py`:
- Around line 45-58: The tests flake because _make_quota_tracker() hardcodes
QuotaWindow.PER_MINUTE causing counters to rotate if the minute rolls over
between operations; update the helper to avoid minute windows by either (a)
change the default window in _make_quota_tracker to a longer-lived window like
QuotaWindow.PER_HOUR or (b) add a window parameter to _make_quota_tracker and
use that in the SubscriptionConfig so tests can pass an hour-long window or a
frozen clock; locate and modify the _make_quota_tracker function and any test
usages (tests around lines 104-132 and 241-259) to use the new default or pass
an explicit longer window, or alternatively freeze QuotaTracker's clock when
constructing it to ensure deterministic behavior.

In `@tests/unit/budget/test_quota_tracker.py`:
- Around line 39-40: The tests are flaky because they rely on real clock
rollovers for PER_MINUTE/PER_DAY windows; update the test helpers and affected
tests to use a longer, stable window (e.g., QuotaWindow.PER_HOUR) so counters
won't reset between immediate calls. Concretely, change the _minute_quota
factory (and any analogous _day_quota helpers) to return
QuotaLimit(window=QuotaWindow.PER_HOUR, ...) and update tests referenced in the
ranges (39-40, 81-109, 163-223, 276-333, 406-430) that call record_usage(),
check_quota(), or snapshot immediately to use those PER_HOUR helpers (leaving
explicit rotation/rollover tests using PER_HOUR as-is) so assertions become
deterministic.

In `@tests/unit/budget/test_quota.py`:
- Around line 431-435: The test test_defaults_to_now calls datetime.now(UTC)
separately from window_start(QuotaWindow.PER_DAY), which can flake at UTC
midnight; fix by capturing the current time once and using it for the assertion:
either call now = datetime.now(UTC) before invoking window_start and pass now
into window_start if it accepts a now parameter, or if window_start has no now
param, obtain before = datetime.now(UTC); result =
window_start(QuotaWindow.PER_DAY); after = datetime.now(UTC) and assert that
result.day is either before.day or after.day to tolerate the boundary. Ensure
references to test_defaults_to_now, window_start, QuotaWindow.PER_DAY, and
datetime.now(UTC) are used to locate the change.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e191842a-93c9-427a-b6b7-9ad62d415e2e

📥 Commits

Reviewing files that changed from the base of the PR and between 26b3108 and 02baced.

📒 Files selected for processing (21)

CLAUDE.md
DESIGN_SPEC.md
README.md
src/ai_company/budget/__init__.py
src/ai_company/budget/cost_tiers.py
src/ai_company/budget/enforcer.py
src/ai_company/budget/quota.py
src/ai_company/budget/quota_tracker.py
src/ai_company/config/defaults.py
src/ai_company/config/schema.py
src/ai_company/engine/__init__.py
src/ai_company/engine/errors.py
src/ai_company/observability/events/budget.py
src/ai_company/observability/events/quota.py
tests/unit/budget/conftest.py
tests/unit/budget/test_cost_tiers.py
tests/unit/budget/test_enforcer_quota.py
tests/unit/budget/test_quota.py
tests/unit/budget/test_quota_tracker.py
tests/unit/config/conftest.py
tests/unit/observability/test_events.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Agent
GitHub Check: Greptile Review

🧰 Additional context used

📓 Path-based instructions (5)

!(DESIGN_SPEC.md|.claude/**|**/litellm/**)