feat: add configurable cost tiers and subscription/quota-aware tracking (#67) #185
…ng (#67)

Add two sub-systems for BudgetEnforcer:
- Cost tier definitions with configurable metadata, price ranges, and model classification (CostTierDefinition, resolve_tiers, classify_model_tier)
- Subscription/quota tracking with per-provider rate/token/request caps, three cost models (per_token, subscription, local), and graceful degradation strategies (QuotaTracker, SubscriptionConfig, DegradationConfig)

Integrate quota checks into BudgetEnforcer.check_can_execute() and add cost_tiers to RootConfig, subscription/degradation to ProviderConfig.

Pre-reviewed by 9 agents, 34 findings addressed:
- Fix _build_exhaustion_reason ignoring estimated_tokens (bug)
- Re-export QuotaExhaustedError from engine/__init__.py
- Fix misleading docstrings about degradation strategy
- Fix _patch_periods() creating duplicate context managers
- Add input validation for negative requests/tokens
- Add DEBUG logging for unknown providers in QuotaTracker
- Compute window_resets_at in QuotaSnapshot
- Add QuotaCheckResult cross-field validation
- Add negative cost guard in classify_model_tier
- Update DESIGN_SPEC.md, CLAUDE.md, README.md docs
- Add 14 new tests for coverage gaps
- Code simplification via single-pass validators
Dependency Review
✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.
Scanned Files: None
Caution: Review failed. The pull request is closed.
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID:
📒 Files selected for processing (10)
📝 Walkthrough
Summary by CodeRabbit

Adds configurable cost tiers, provider subscription/quota models, a windowed QuotaTracker service, quota-aware BudgetEnforcer integration (with QuotaExhaustedError), observability events for quota/budget, config/schema extensions, default config key, and comprehensive unit tests for the new budgeting surface.
Sequence Diagram(s)
sequenceDiagram
participant Agent
participant BudgetEnforcer as BudgetEnforcer
participant QuotaTracker
participant Observability as Observability
Agent->>BudgetEnforcer: check_can_execute(agent_id, provider_name, estimated_tokens)
alt quota_tracker configured & provider_name provided
BudgetEnforcer->>QuotaTracker: check_quota(provider_name, estimated_tokens)
QuotaTracker->>QuotaTracker: evaluate windows (per_minute/hour/day/month)
alt allowed
QuotaTracker->>Observability: emit QUOTA_CHECK_ALLOWED
QuotaTracker-->>BudgetEnforcer: QuotaCheckResult(allowed=true)
BudgetEnforcer-->>Agent: proceed
else denied
QuotaTracker->>Observability: emit QUOTA_CHECK_DENIED
QuotaTracker-->>BudgetEnforcer: QuotaCheckResult(allowed=false, reason)
BudgetEnforcer->>Observability: log QUOTA_CHECK_DENIED
BudgetEnforcer-->>Agent: raise QuotaExhaustedError
end
else no quota_tracker or no provider_name
BudgetEnforcer->>Observability: debug skip quota check
BudgetEnforcer-->>Agent: proceed
end
sequenceDiagram
participant Engine
participant QuotaTracker
participant Provider
participant Monitor
Engine->>QuotaTracker: check_quota(provider, estimated_tokens)
QuotaTracker->>QuotaTracker: get current snapshots
alt within quota
QuotaTracker-->>Engine: allowed=true
Engine->>Provider: perform request
Provider-->>Engine: response
Engine->>QuotaTracker: record_usage(provider, requests=1, tokens=N)
QuotaTracker->>Observability: emit QUOTA_USAGE_RECORDED
else exhausted
QuotaTracker-->>Engine: allowed=false (reason)
Engine->>Observability: log quota denial
end
Monitor->>QuotaTracker: time boundary crossed
QuotaTracker->>QuotaTracker: rotate window counters
QuotaTracker->>Observability: emit QUOTA_WINDOW_ROTATED
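The "rotate window counters" step in the diagram above can be illustrated with a minimal sketch. It assumes an hourly window and a small mutable counter; `WindowUsage`, `hour_start`, and this standalone `record_usage` are hypothetical stand-ins, not the PR's `QuotaTracker` code, and event emission is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


def hour_start(now: datetime) -> datetime:
    """Truncate an aware datetime to the start of its UTC hour."""
    utc_now = now.astimezone(timezone.utc)
    return utc_now.replace(minute=0, second=0, microsecond=0)


@dataclass
class WindowUsage:
    """Hypothetical counter for one provider and one window."""

    window_start: datetime
    requests: int = 0
    tokens: int = 0


def record_usage(
    usage: WindowUsage, now: datetime, requests: int, tokens: int
) -> bool:
    """Accumulate usage, resetting counters first if a boundary was crossed.

    Returns True when the window was rotated.
    """
    current = hour_start(now)
    rotated = current > usage.window_start
    if rotated:
        # A new window has begun: earlier usage no longer counts.
        usage.window_start = current
        usage.requests = 0
        usage.tokens = 0
    usage.requests += requests
    usage.tokens += tokens
    return rotated
```

In this sketch rotation happens lazily on the next recorded call rather than on a timer, which is one possible reading of the "time boundary crossed" arrow.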
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~70 minutes
🚥 Pre-merge checks: ✅ 5 passed
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the system's ability to manage and enforce costs and resource usage for AI model providers. It introduces a flexible framework for defining and classifying models into configurable cost tiers, alongside a robust quota and subscription tracking mechanism. By integrating these features into the existing budget enforcement, the system can now perform granular pre-flight checks, prevent overages, and lay the groundwork for graceful degradation strategies, thereby improving cost control and operational stability.
Greptile Summary
This PR completes the M5 budget enforcement layer by adding three tightly-integrated components: configurable cost tiers ( Key findings:
Confidence Score: 3/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Caller
participant BudgetEnforcer
participant QuotaTracker
participant QuotaSnapshot
Caller->>BudgetEnforcer: check_can_execute(agent_id, provider_name, estimated_tokens)
BudgetEnforcer->>BudgetEnforcer: _check_monthly_hard_stop()
BudgetEnforcer->>BudgetEnforcer: _check_daily_limit()
BudgetEnforcer->>BudgetEnforcer: _check_provider_quota(agent_id, provider_name, estimated_tokens)
BudgetEnforcer->>QuotaTracker: check_quota(provider_name, estimated_tokens)
QuotaTracker->>QuotaTracker: _is_window_exhausted(usage, quota, estimated_tokens)
alt quota exhausted
QuotaTracker-->>BudgetEnforcer: QuotaCheckResult(allowed=False, reason=..., exhausted_windows=...)
BudgetEnforcer-->>Caller: raise QuotaExhaustedError
else quota OK
QuotaTracker-->>BudgetEnforcer: QuotaCheckResult(allowed=True)
BudgetEnforcer-->>Caller: return (execution allowed)
end
Caller->>QuotaTracker: record_usage(provider_name, requests, tokens)
QuotaTracker->>QuotaTracker: rotate window if boundary crossed
QuotaTracker->>QuotaTracker: accumulate counters
Caller->>QuotaTracker: get_snapshot(provider_name)
QuotaTracker-->>QuotaSnapshot: build QuotaSnapshot(requests_used, tokens_used, ...)
QuotaSnapshot-->>Caller: snapshot with is_exhausted, requests_remaining, tokens_remaining
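The derived snapshot fields named in the diagram (`is_exhausted`, `requests_remaining`, `tokens_remaining`) could be computed rather than stored, in line with the project's preference for derived values. The sketch below uses a frozen dataclass instead of the PR's Pydantic model; `SnapshotSketch` and the rule that `None` means "no cap" are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SnapshotSketch:
    """Hypothetical read-only view of usage for one provider window."""

    requests_used: int
    tokens_used: int
    max_requests: Optional[int] = None  # None = uncapped (assumption)
    max_tokens: Optional[int] = None  # None = uncapped (assumption)

    @property
    def requests_remaining(self) -> Optional[int]:
        """Requests left in the window, or None when uncapped."""
        if self.max_requests is None:
            return None
        return max(self.max_requests - self.requests_used, 0)

    @property
    def tokens_remaining(self) -> Optional[int]:
        """Tokens left in the window, or None when uncapped."""
        if self.max_tokens is None:
            return None
        return max(self.max_tokens - self.tokens_used, 0)

    @property
    def is_exhausted(self) -> bool:
        """True when any capped dimension has nothing left."""
        return self.requests_remaining == 0 or self.tokens_remaining == 0
```

For example, `SnapshotSketch(requests_used=3, tokens_used=500, max_requests=10, max_tokens=500)` reports seven requests remaining but is exhausted on tokens.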
Last reviewed commit: 05fbbc8
Pull request overview
Adds configurable cost-tier definitions and provider subscription/quota awareness to the budget enforcement layer, introducing new budget-domain models/services and wiring them into config + observability.
Changes:
- Introduces configurable cost tiers (`CostTierDefinition`, `CostTiersConfig`, `resolve_tiers()`, `classify_model_tier()`) and exposes them via config and observability events.
- Adds subscription/quota domain models (`SubscriptionConfig`, `QuotaLimit`, `QuotaSnapshot`, etc.) and an async-safe `QuotaTracker` for per-provider, per-window usage tracking.
- Integrates quota checks into `BudgetEnforcer` pre-flight execution checks and adds associated errors/events/docs/tests.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tests/unit/observability/test_events.py | Updates domain-module discovery expectations to include new quota events module. |
| tests/unit/config/conftest.py | Extends config factories to include cost_tiers, subscription, and degradation defaults. |
| tests/unit/budget/test_quota_tracker.py | Adds unit tests covering QuotaTracker behavior (recording, rotation, snapshots, validation). |
| tests/unit/budget/test_quota.py | Adds unit tests for quota/subscription models and helper functions. |
| tests/unit/budget/test_enforcer_quota.py | Adds unit tests validating BudgetEnforcer quota integration and error behavior. |
| tests/unit/budget/test_cost_tiers.py | Adds unit tests for cost tier definitions, merging, and classification boundaries. |
| tests/unit/budget/conftest.py | Adds budget test factories/fixtures for cost tiers and quota tracking. |
| src/ai_company/observability/events/quota.py | Adds quota-related observability event constants. |
| src/ai_company/observability/events/budget.py | Adds budget event constants for tier resolution and classification misses. |
| src/ai_company/engine/errors.py | Introduces QuotaExhaustedError for provider quota exhaustion. |
| src/ai_company/engine/__init__.py | Re-exports QuotaExhaustedError from the engine package. |
| src/ai_company/config/schema.py | Extends provider/root config with subscription, degradation, and cost_tiers fields. |
| src/ai_company/config/defaults.py | Adds default cost_tiers stanza to the default config dict. |
| src/ai_company/budget/quota_tracker.py | Implements QuotaTracker service with window rotation + snapshots + logging. |
| src/ai_company/budget/quota.py | Adds quota/subscription/degradation models and helpers (window_start, effective_cost_per_1k). |
| src/ai_company/budget/enforcer.py | Wires quota checks into BudgetEnforcer and adds check_quota() API. |
| src/ai_company/budget/cost_tiers.py | Adds cost tier models, built-in tier set, merge logic, and classifier. |
| src/ai_company/budget/__init__.py | Re-exports new budget APIs (tiers, quota models, tracker). |
| README.md | Updates milestone feature list to mention tiers + quota/subscription tracking. |
| DESIGN_SPEC.md | Documents newly implemented budget/quota/tier capabilities and event/module structure. |
| CLAUDE.md | Updates repo layout description to include tiers and quota/subscription tracking under budget. |
Comments suppressed due to low confidence (1)
src/ai_company/budget/enforcer.py:127
`except MemoryError, RecursionError:` is invalid syntax in Python 3 and will prevent this module from importing. Use tuple exception syntax (and optionally bind the exception) instead, e.g. `except (MemoryError, RecursionError):` / `as exc`.
    except BudgetExhaustedError:
        raise
    except MemoryError, RecursionError:  # builtin MemoryError (OOM)
        raise
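For comparison, the parenthesized tuple form is accepted by every Python 3 release, while the unparenthesized form in the snippet above parses only on Python 3.14 and later (PEP 758), which is the version this project's own guidelines target. A small sketch of the portable form, where `classify` is a hypothetical helper used only to make the behaviour observable:

```python
def classify(exc: BaseException) -> str:
    """Report which handler a given exception lands in."""
    try:
        raise exc
    except (MemoryError, RecursionError):
        # Parenthesized tuple: valid on all Python 3 versions.
        return "fatal"
    except Exception:
        return "recoverable"
```

On Python 3.14+, the same handler written as `except MemoryError, RecursionError:` behaves identically as long as no `as` clause is used.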
    if provider_name not in self._usage:
        logger.debug(
            QUOTA_USAGE_SKIPPED,
            provider=provider_name,
            reason="no_subscription_config",
        )
        return
QuotaTracker treats providers with a SubscriptionConfig but no quotas the same as unknown providers (provider_name not in self._usage) and logs reason="no_subscription_config". This is misleading in observability; consider distinguishing "unknown provider" from "no quotas configured" (or key _usage off subscriptions and check sub_config.quotas separately).
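One way to separate the two cases is to decide the skip reason before logging. This is only a sketch: `skip_reason`, the reason strings, and the mapping shape (provider name to its configured quota windows) are hypothetical and not taken from the PR.

```python
from typing import Mapping, Optional, Sequence


def skip_reason(
    provider_name: str,
    subscriptions: Mapping[str, Sequence[str]],
) -> Optional[str]:
    """Return why usage tracking is skipped, or None if it should proceed.

    ``subscriptions`` maps a provider name to its configured quota
    windows (hypothetical shape for this sketch).
    """
    if provider_name not in subscriptions:
        return "unknown_provider"
    if not subscriptions[provider_name]:
        return "no_quotas_configured"
    return None
```

With `{"test-provider": ["per_hour"], "test-free": []}`, an unlisted provider yields `"unknown_provider"`, `"test-free"` yields `"no_quotas_configured"`, and `"test-provider"` proceeds to tracking.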
src/ai_company/budget/cost_tiers.py
Outdated
Provides configurable metadata for cost tiers: price ranges, display
properties, and model-to-tier classification. The built-in ``CostTier``
enum (``core.enums``) remains for backward compatibility; this module
adds a configurable layer on top.
The module docstring references the built-in CostTier enum as (core.enums), but in this repo the enum lives in ai_company.core.enums. Updating the reference avoids pointing readers to a non-existent module path.
tests/unit/budget/test_quota.py
Outdated
    def test_defaults_to_now(self) -> None:
        """Uses current time when now is not provided."""
        result = window_start(QuotaWindow.PER_DAY)
        now = datetime.now(UTC)
        assert result.day == now.day
This test can be flaky around UTC midnight because it calls window_start() and then separately calls datetime.now(UTC) for comparison; the day could roll over between the two calls. Prefer passing a fixed now= into window_start() (or freezing time) and asserting the full expected value.
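A deterministic variant of that test might look like the sketch below. `window_start_day` is a simplified stand-in for `window_start(QuotaWindow.PER_DAY, now=...)`, written only so the example is self-contained; the real function's signature is assumed from the comment above.

```python
from datetime import datetime, timezone
from typing import Optional


def window_start_day(now: Optional[datetime] = None) -> datetime:
    """Stand-in for window_start(QuotaWindow.PER_DAY, now=now)."""
    current = now if now is not None else datetime.now(timezone.utc)
    return current.replace(hour=0, minute=0, second=0, microsecond=0)


def test_uses_supplied_now() -> None:
    """A fixed timestamp makes the expected value exact."""
    # One microsecond before midnight: the worst case for the flaky version.
    fixed = datetime(2025, 3, 14, 23, 59, 59, 999999, tzinfo=timezone.utc)
    expected = datetime(2025, 3, 14, tzinfo=timezone.utc)
    assert window_start_day(fixed) == expected
```

Asserting the full datetime, rather than only `.day`, also catches mistakes in the hour, minute, and timezone fields.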
DESIGN_SPEC.md
Outdated
| | **Personality compatibility scoring** | Adopted (M3) | Weighted composite: 60% Big Five similarity (openness, conscientiousness, agreeableness, stress_response → 1−\|diff\|; extraversion → tent-function peaking at 0.3 diff), 20% collaboration alignment (ordinal adjacency: INDEPENDENT↔PAIR↔TEAM), 20% conflict approach (constructive pairs score 1.0, destructive pairs 0.2, mixed 0.4–0.6). `itertools.combinations` for team-level averaging. Result clamped to [0, 1]. | Covers behavioral diversity (extraversion complement), task alignment (conscientiousness similarity), and interpersonal friction (conflict approach). Weights are configurable module constants. | | ||
| | **Agent behavior testing** | Planned (M3) | Scripted `FakeProvider` for unit tests (deterministic turn sequences); behavioral outcome assertions for integration tests (task completed, tools called, cost within budget). | Leverages existing `FakeProvider` and `CompletionResponseFactory` fixtures. Precise engine testing without brittle response-matching at integration level. | | ||
| | **LLM call analytics** | Adopted (incremental) | M3: proxy metrics (`turns_per_task`, `tokens_per_task`) — adopted. M4 data models: call categorization (`productive`, `coordination`, `system`), category analytics, coordination metrics, orchestration ratio — adopted. M4 runtime collection pipeline and M5+ full analytics: planned. | Append-only, never blocks execution. Builds on existing `CostRecord` infrastructure. Detects orchestration overhead early. See §10.5. | | ||
| | **Cost tiers & quota tracking** | Adopted (M5) | Configurable `CostTier` definitions with merge/override semantics via `resolve_tiers(defaults, overrides)`. `SubscriptionConfig` + `QuotaLimit` model per-provider subscription plans. `QuotaTracker` enforces per-provider request/token quotas with window-based rotation. `DegradationConfig` controls behavior when quotas are approached. | Enables cost classification without hardcoding vendor tiers. Quota tracking prevents surprise overages at the provider level. Window-based rotation aligns quota resets with billing periods. See §10.4. | |
DESIGN_SPEC says cost tiers are merged via resolve_tiers(defaults, overrides), but the implementation added in this PR exposes resolve_tiers(config: CostTiersConfig) instead. Update the spec to match the actual API so readers don’t implement against the wrong signature.
Code Review
This pull request introduces a significant and well-structured set of features for budget management, including configurable cost tiers, provider-level quota tracking, and integration with the existing BudgetEnforcer. The new Pydantic models for quotas and cost tiers are robust, and the QuotaTracker service is well-implemented with attention to concurrency safety. The accompanying tests are comprehensive. I have identified one critical syntax error and one high-severity concern regarding the fail-open behavior of the budget check.
        await self._check_provider_quota(agent_id, provider_name)
    except BudgetExhaustedError:
        raise
    except MemoryError, RecursionError:  # builtin MemoryError (OOM)
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/ai_company/budget/enforcer.py (1)
117-134: ⚠️ Potential issue | 🟠 Major
Don't fail open when the quota subsystem itself errors.
This broad fallback now includes `_check_provider_quota()`. Before quota integration, the allow-on-error path only weakened spend preflight; with quotas inside the same `try`, a tracker/config bug silently disables provider-cap enforcement altogether.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/ai_company/budget/enforcer.py` around lines 117 - 134, The current try/except wraps _check_monthly_hard_stop, _check_daily_limit, and _check_provider_quota so any exception in _check_provider_quota silently falls back to allow execution; move _check_provider_quota out of the broad try (or give it its own tight try that only catches BudgetExhaustedError and MemoryError/RecursionError) so provider quota subsystem errors are not swallowed—adjust calls to _check_monthly_hard_stop, _check_daily_limit, and _check_provider_quota accordingly to ensure only the preflight spend checks may fall back, while quota errors propagate.
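The restructuring suggested above can be sketched as follows. This is not the PR's `BudgetEnforcer`; `check_can_execute` here is a free function taking hypothetical check callables, `BudgetExhaustedError` is a local stand-in, and tolerated errors are returned instead of logged so the behaviour is easy to observe.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List

Check = Callable[[], Awaitable[None]]


class BudgetExhaustedError(Exception):
    """Stand-in for the project's budget error (assumption)."""


async def check_can_execute(
    spend_checks: Iterable[Check], quota_check: Check
) -> List[Exception]:
    """Run spend checks fail-open, then the quota check fail-closed."""
    tolerated: List[Exception] = []
    try:
        for check in spend_checks:
            await check()
    except BudgetExhaustedError:
        raise
    except (MemoryError, RecursionError):
        raise
    except Exception as exc:
        # Only the spend preflight may fall back to "allow".
        tolerated.append(exc)
    # Outside the broad try: quota subsystem errors propagate.
    await quota_check()
    return tolerated


async def passing_check() -> None:
    """Example check that succeeds."""


async def broken_check() -> None:
    """Example check that hits an internal bug."""
    raise RuntimeError("tracker bug")
```

Calling `check_can_execute([broken_check], passing_check)` returns the tolerated `RuntimeError`, while `check_can_execute([passing_check], broken_check)` raises it to the caller.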
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@DESIGN_SPEC.md`:
- Line 1697: The spec overstates runtime behavior for
ProviderConfig.degradation: either narrow DegradationConfig documentation to
reflect current behavior (that QuotaExhaustedError still always raises and
FALLBACK/QUEUE routing is not implemented) and update the QuotaExhaustedError
docstring to explicitly state the limitation, or implement the missing runtime
behaviors (FALLBACK/QUEUE auto-downgrade or rejection) to match the spec; in
either case update DESIGN_SPEC.md and the QuotaExhaustedError docstring
consistently and add an explicit note (per coding guidelines) that alerts users
when the implementation deviates from the spec and why.
In `@src/ai_company/budget/enforcer.py`:
- Around line 94-99: check_can_execute currently performs a zero-token quota
check because it calls _check_provider_quota()/check_quota() without passing
through an estimated token count; thread an estimated_tokens argument from
check_can_execute into any calls to _check_provider_quota and ultimately to the
quota check method (check_quota) so the preflight uses the projected usage, and
update the other call sites referenced (the blocks around the other occurrences
of _check_provider_quota / check_quota noted at the other ranges) to pass the
same estimated_tokens value to maintain consistent token-based caps enforcement.
In `@src/ai_company/budget/quota_tracker.py`:
- Around line 88-157: check_quota() and record_usage() have a TOCTOU gap: under
concurrency multiple coroutines can each pass check_quota() then race to
record_usage(), oversubscribing limits. Fix by introducing an atomic admission
method (e.g. try_consume or reserve_and_consume) that acquires self._lock,
computes remaining capacity using the same window logic (window_start via
window_start(...) and current from self._usage[provider_name]), and if capacity
suffices increments the counters and returns success; update callers to use this
new atomic method instead of calling check_quota() then record_usage(), and
leave check_quota() as a pure read-only inspector (or remove it) so all
admission decisions are made via the single locked path in the new method; also
adjust record_usage() so it’s either private for non-admission bookkeeping or
folded into the atomic consume implementation.
- Around line 345-370: The token-path currently treats a request that exactly
reaches max_tokens as exhausted because both _is_window_exhausted and
_build_exhaustion_reason use "projected >= quota.max_tokens"; change those
checks to use a strict ">" comparison (i.e., projected > quota.max_tokens) so a
request that exactly fills the remaining tokens is allowed, and only requests
that would exceed the token limit are denied; update the comparisons in both
_is_window_exhausted (the return expression) and _build_exhaustion_reason (the
token append condition that uses the local projected variable) to reflect this
strict comparison.
In `@src/ai_company/budget/quota.py`:
- Around line 305-347: The window_start function currently reads calendar fields
from the provided now datetime and then tags the result as UTC, which yields
wrong boundaries for non-UTC-aware datetimes; update window_start (and
references to QuotaWindow.PER_MINUTE / PER_HOUR / PER_DAY) to first reject naive
datetimes (raise a ValueError if now.tzinfo is None or now.tzinfo.utcoffset(now)
is None) and then normalize/convert the timestamp to UTC via now =
now.astimezone(UTC) before extracting year/month/day/hour/minute to build the
window-start datetime.
In `@src/ai_company/config/schema.py`:
- Around line 202-209: RootConfig currently accepts
degradation.fallback_providers (a tuple of strings) without verifying they
reference valid provider names, so typos pass config load; update RootConfig
validation to cross-reference each entry in degradation.fallback_providers
against the canonical provider registry/list (the same source used for routing
model refs) and raise a validation error for any unknown provider names.
Implement this as a pydantic validator/root_validator inside RootConfig (or a
dedicated validate_fallback_providers method) that iterates over
RootConfig.degradation.fallback_providers, checks membership in the provider
registry, and produces a clear ValidationError listing invalid names so config
load fails fast.
- Around line 206-209: The default_factory currently builds a DegradationConfig
(which defaults to FALLBACK and triggers a validator warning when
fallback_providers is empty), causing CONFIG_VALIDATION_FAILED noise; fix by
making the default a no-op degradation config instead of the FALLBACK
default—either change the Field to use default=None with
Optional[DegradationConfig] or set default_factory to create an explicit
non-degrading config (e.g. default_factory=lambda:
DegradationConfig(mode=DegradationMode.NONE)); update the degradation Field
declaration and imports to reference DegradationConfig and DegradationMode
accordingly so normal ProviderConfig parsing does not emit warnings when the
user did not opt into degradation.
In `@src/ai_company/engine/errors.py`:
- Around line 41-46: The current QuotaExhaustedError class and call sites make
every quota miss terminal; update the logic that raises QuotaExhaustedError to
first inspect DegradationConfig.action (the degradation routing policy) and only
raise a terminal QuotaExhaustedError when the configured action mandates
termination; for FALLBACK or QUEUE actions, either return a non-terminal
signal/result or raise a different non-terminal exception type so the caller can
perform fallback/queue behavior; ensure references to QuotaExhaustedError and
BudgetExhaustedError remain consistent and update any docs/comments to reflect
that raising QuotaExhaustedError now depends on DegradationConfig.action.
In `@tests/unit/budget/test_cost_tiers.py`:
- Around line 64-72: The test allows zero-width tiers but the classifier uses
the half-open rule (min <= cost < max) so price_range_min == price_range_max
will never match; fix by either rejecting equal bounds during CostTierDefinition
validation (add a check in CostTierDefinition.__post_init__ or the existing
validate_tiers function to raise for price_range_min >= price_range_max) or
change the classifier comparison in the function that does the tier matching
(where it currently uses "min <= cost < max") to make the upper bound inclusive
for the final/only tier (use "min <= cost <= max" for that case or otherwise
ensure a deterministic inclusive rule). Ensure you reference CostTierDefinition
and the classifier function/method in cost_tiers.py when applying the change.
- Around line 302-349: Collapse the duplicated boundary tests for
classify_model_tier into a single `@pytest.mark.parametrize` table (reusing the
same parameterization you added around lines 368-400), replacing the separate
functions test_boundary_low_medium, test_boundary_medium_high,
test_boundary_high_premium and the within-range tests with parametrized cases
that include input cost and expected tier; remove the duplicated standalone
tests (the ones at lines 302-349) and ensure the parameter table also adds a
missing negative-cost regression case (e.g., cost = -0.001 -> expected "low") so
all boundaries and the negative-cost scenario are covered by the single
parametrized test for classify_model_tier.
In `@tests/unit/budget/test_enforcer_quota.py`:
- Around line 45-58: The tests flake because _make_quota_tracker() hardcodes
QuotaWindow.PER_MINUTE causing counters to rotate if the minute rolls over
between operations; update the helper to avoid minute windows by either (a)
change the default window in _make_quota_tracker to a longer-lived window like
QuotaWindow.PER_HOUR or (b) add a window parameter to _make_quota_tracker and
use that in the SubscriptionConfig so tests can pass an hour-long window or a
frozen clock; locate and modify the _make_quota_tracker function and any test
usages (tests around lines 104-132 and 241-259) to use the new default or pass
an explicit longer window, or alternatively freeze QuotaTracker's clock when
constructing it to ensure deterministic behavior.
In `@tests/unit/budget/test_quota_tracker.py`:
- Around line 39-40: The tests are flaky because they rely on real clock
rollovers for PER_MINUTE/PER_DAY windows; update the test helpers and affected
tests to use a longer, stable window (e.g., QuotaWindow.PER_HOUR) so counters
won't reset between immediate calls. Concretely, change the _minute_quota
factory (and any analogous _day_quota helpers) to return
QuotaLimit(window=QuotaWindow.PER_HOUR, ...) and update tests referenced in the
ranges (39-40, 81-109, 163-223, 276-333, 406-430) that call record_usage(),
check_quota(), or snapshot immediately to use those PER_HOUR helpers (leaving
explicit rotation/rollover tests using PER_HOUR as-is) so assertions become
deterministic.
In `@tests/unit/budget/test_quota.py`:
- Around line 431-435: The test test_defaults_to_now calls datetime.now(UTC)
separately from window_start(QuotaWindow.PER_DAY), which can flake at UTC
midnight; fix by capturing the current time once and using it for the assertion:
either call now = datetime.now(UTC) before invoking window_start and pass now
into window_start if it accepts a now parameter, or if window_start has no now
param, obtain before = datetime.now(UTC); result =
window_start(QuotaWindow.PER_DAY); after = datetime.now(UTC) and assert that
result.day is either before.day or after.day to tolerate the boundary. Ensure
references to test_defaults_to_now, window_start, QuotaWindow.PER_DAY, and
datetime.now(UTC) are used to locate the change.
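The two `quota_tracker.py` findings above (the check-then-record race and the strict token comparison) can be sketched together. `TrackerSketch` and `try_consume` are hypothetical names following the reviewer's suggestion; window rotation and per-provider state are omitted to keep the admission path visible.

```python
import asyncio
from typing import List


class TrackerSketch:
    """Hypothetical single-provider, single-window token tracker."""

    def __init__(self, max_tokens: int) -> None:
        self._lock = asyncio.Lock()
        self._max_tokens = max_tokens
        self._tokens_used = 0

    async def try_consume(self, tokens: int) -> bool:
        """Atomically admit a request and record its tokens.

        The check and the increment happen under one lock, so two
        coroutines cannot both pass the check and then both record.
        """
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        async with self._lock:
            projected = self._tokens_used + tokens
            # Strict '>' so a request that exactly fills the quota passes.
            if projected > self._max_tokens:
                return False
            self._tokens_used = projected
            return True


async def demo() -> List[bool]:
    """Three concurrent requests against a 100-token cap."""
    tracker = TrackerSketch(max_tokens=100)
    return await asyncio.gather(
        tracker.try_consume(60),
        tracker.try_consume(40),  # reaches the cap exactly
        tracker.try_consume(1),  # would exceed the cap
    )
```

A read-only `check_quota()` can still exist for inspection, but in this design admission decisions go through the single locked path.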
---
Outside diff comments:
In `@src/ai_company/budget/enforcer.py`:
- Around line 117-134: The current try/except wraps _check_monthly_hard_stop,
_check_daily_limit, and _check_provider_quota so any exception in
_check_provider_quota silently falls back to allow execution; move
_check_provider_quota out of the broad try (or give it its own tight try that
only catches BudgetExhaustedError and MemoryError/RecursionError) so provider
quota subsystem errors are not swallowed—adjust calls to
_check_monthly_hard_stop, _check_daily_limit, and _check_provider_quota
accordingly to ensure only the preflight spend checks may fall back, while quota
errors propagate.
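The `window_start` finding in the inline comments above (reject naive datetimes, then normalize to UTC before truncating) can be sketched for the daily window. `day_window_start` is a hypothetical, simplified stand-in for the PR's `window_start(QuotaWindow.PER_DAY, now)`.

```python
from datetime import datetime, timedelta, timezone


def day_window_start(now: datetime) -> datetime:
    """Return the start of the UTC day containing ``now``.

    Raises:
        ValueError: If ``now`` is naive (has no usable UTC offset).
    """
    if now.tzinfo is None or now.tzinfo.utcoffset(now) is None:
        raise ValueError("now must be timezone-aware")
    # Convert first, then read calendar fields, so the boundary is UTC's.
    utc_now = now.astimezone(timezone.utc)
    return utc_now.replace(hour=0, minute=0, second=0, microsecond=0)


# 01:30 on 1 January at UTC+05:30 is still 31 December in UTC.
local = datetime(
    2025, 1, 1, 1, 30, tzinfo=timezone(timedelta(hours=5, minutes=30))
)
start = day_window_start(local)
```

Without the `astimezone` step, the local calendar date (1 January) would be stamped as UTC, placing the window a day later than the true UTC boundary.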
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: e191842a-93c9-427a-b6b7-9ad62d415e2e
📒 Files selected for processing (21)
CLAUDE.md, DESIGN_SPEC.md, README.md, src/ai_company/budget/__init__.py, src/ai_company/budget/cost_tiers.py, src/ai_company/budget/enforcer.py, src/ai_company/budget/quota.py, src/ai_company/budget/quota_tracker.py, src/ai_company/config/defaults.py, src/ai_company/config/schema.py, src/ai_company/engine/__init__.py, src/ai_company/engine/errors.py, src/ai_company/observability/events/budget.py, src/ai_company/observability/events/quota.py, tests/unit/budget/conftest.py, tests/unit/budget/test_cost_tiers.py, tests/unit/budget/test_enforcer_quota.py, tests/unit/budget/test_quota.py, tests/unit/budget/test_quota_tracker.py, tests/unit/config/conftest.py, tests/unit/observability/test_events.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Agent
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (5)
!(DESIGN_SPEC.md|.claude/**|**/litellm/**)
📄 CodeRabbit inference engine (CLAUDE.md)
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small` as aliases. Tests must use `test-provider`, `test-small-001`, etc.
Files:
README.md, CLAUDE.md, src/ai_company/engine/errors.py, src/ai_company/config/defaults.py, src/ai_company/observability/events/budget.py, tests/unit/budget/test_enforcer_quota.py, src/ai_company/observability/events/quota.py, src/ai_company/engine/__init__.py, src/ai_company/config/schema.py, tests/unit/budget/test_cost_tiers.py, tests/unit/config/conftest.py, tests/unit/budget/test_quota.py, src/ai_company/budget/__init__.py, tests/unit/observability/test_events.py, src/ai_company/budget/cost_tiers.py, src/ai_company/budget/quota_tracker.py, tests/unit/budget/test_quota_tracker.py, tests/unit/budget/conftest.py, DESIGN_SPEC.md, src/ai_company/budget/quota.py, src/ai_company/budget/enforcer.py
**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Always read `DESIGN_SPEC.md` before implementing any feature or planning any issue; the spec is the mandatory starting point for architecture, data models, and behavior. If implementation deviates from the spec, alert the user and explain why; the user decides whether to proceed or update the spec. Do NOT silently diverge. When a spec section is referenced, read that section verbatim. When approved deviations occur, update `DESIGN_SPEC.md` to reflect the new reality.
Files:
README.md, CLAUDE.md, DESIGN_SPEC.md
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Do not use `from __future__ import annotations`; Python 3.14 has PEP 649 native lazy annotations
Use `except A, B:` syntax without parentheses (not `except (A, B):`): PEP 758 except syntax, enforced by ruff on Python 3.14
Add type hints to all public functions and classes — mypy strict mode enforced
Use Google-style docstrings (required on all public classes and functions) — enforced by ruff D rules
Enforce immutability: create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use `copy.deepcopy()` at construction + `MappingProxyType` wrapping for read-only enforcement. For dict/list fields in frozen Pydantic models, rely on `frozen=True` for field reassignment prevention and `copy.deepcopy()` at system boundaries
Separate config (frozen Pydantic models) from runtime state (mutable-via-copy models using `model_copy(update=...)`). Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 with adopted conventions: use `@computed_field` for derived values instead of storing redundant fields; use `NotBlankStr` (from `core.types`) for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations (e.g., multiple tool invocations, parallel agent calls); prefer structured concurrency over bare `create_task`
Keep functions under 50 lines and files under 800 lines
Handle errors explicitly — never silently swallow exceptions
Validate input at system boundaries (user input, external APIs, config files)
Set line length to 88 characters (ruff configured)
Files:
src/ai_company/engine/errors.py, src/ai_company/config/defaults.py, src/ai_company/observability/events/budget.py, tests/unit/budget/test_enforcer_quota.py, src/ai_company/observability/events/quota.py, src/ai_company/engine/__init__.py, src/ai_company/config/schema.py, tests/unit/budget/test_cost_tiers.py, tests/unit/config/conftest.py, tests/unit/budget/test_quota.py, src/ai_company/budget/__init__.py, tests/unit/observability/test_events.py, src/ai_company/budget/cost_tiers.py, src/ai_company/budget/quota_tracker.py, tests/unit/budget/test_quota_tracker.py, tests/unit/budget/conftest.py, src/ai_company/budget/quota.py, src/ai_company/budget/enforcer.py
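The immutability rule for non-Pydantic collections can be illustrated with a small sketch. `ToolRegistry` and `with_tool` are hypothetical names chosen for this example and are not taken from the repository; the point is only the `copy.deepcopy()` plus `MappingProxyType` pattern the guideline describes.

```python
import copy
from types import MappingProxyType
from typing import Any, Mapping


class ToolRegistry:
    """Hypothetical registry illustrating deepcopy + MappingProxyType."""

    def __init__(self, tools: Mapping[str, dict[str, Any]]) -> None:
        # Deep-copy at construction so later caller-side mutation cannot leak in,
        # then wrap in a read-only view (top-level keys cannot be reassigned).
        self._tools: Mapping[str, dict[str, Any]] = MappingProxyType(
            copy.deepcopy(dict(tools))
        )

    @property
    def tools(self) -> Mapping[str, dict[str, Any]]:
        """Read-only view of the registered tools."""
        return self._tools

    def with_tool(self, name: str, schema: dict[str, Any]) -> "ToolRegistry":
        """Return a new registry with one more tool; this one is unchanged."""
        return ToolRegistry({**self._tools, name: schema})
```

Note that `MappingProxyType` only blocks assignment on the outer mapping; the nested dicts stay mutable, which is why the deep copy at the boundary still matters.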
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Every module with business logic must have: `from ai_company.observability import get_logger` followed by `logger = get_logger(__name__)` — never use `import logging` / `logging.getLogger()` / `print()`
Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`
Use structured logging with kwargs: always `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`
Log all error paths at WARNING or ERROR level with context before raising exceptions
Log all state transitions at INFO level
Use DEBUG logging for object creation, internal flow, and entry/exit of key functions
Files:
src/ai_company/engine/errors.py, src/ai_company/config/defaults.py, src/ai_company/observability/events/budget.py, src/ai_company/observability/events/quota.py, src/ai_company/engine/__init__.py, src/ai_company/config/schema.py, src/ai_company/budget/__init__.py, src/ai_company/budget/cost_tiers.py, src/ai_company/budget/quota_tracker.py, src/ai_company/budget/quota.py, src/ai_company/budget/enforcer.py
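A rough sketch of the logging rules above: an event constant first, context as keyword arguments, and a WARNING logged before raising. The real logger comes from `ai_company.observability.get_logger`; so that the snippet runs standalone it is replaced here by a hypothetical `StubLogger`, and `QUOTA_USAGE_REJECTED` is an invented constant, not one from the repository.

```python
from typing import Any

# Hypothetical event constant; real ones live under ai_company.observability.events.
QUOTA_USAGE_REJECTED = "quota.usage.rejected"


class StubLogger:
    """Stand-in for the logger returned by get_logger(__name__)."""

    def __init__(self) -> None:
        self.records: list[tuple[str, str, dict[str, Any]]] = []

    def warning(self, event: str, **context: Any) -> None:
        # Store level, event name, and structured context for inspection.
        self.records.append(("warning", event, context))


logger = StubLogger()


def validate_tokens(tokens: int) -> int:
    """Reject negative token counts, logging context before raising."""
    if tokens < 0:
        # Event constant first, details as kwargs (no %-style formatting).
        logger.warning(QUOTA_USAGE_REJECTED, tokens=tokens, reason="negative")
        msg = f"tokens must be non-negative, got {tokens}"
        raise ValueError(msg)
    return tokens
```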
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use pytest markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow` to categorize tests
Maintain 80% minimum code coverage (enforced in CI with `--cov-fail-under=80`)
Use `asyncio_mode = "auto"` in pytest configuration — no manual `@pytest.mark.asyncio` needed on async tests
Set test timeout to 30 seconds per test
Use `pytest-xdist` via `-n auto` for parallel test execution
Prefer `@pytest.mark.parametrize` for testing similar cases instead of multiple nearly-identical tests
Files:
tests/unit/budget/test_enforcer_quota.py, tests/unit/budget/test_cost_tiers.py, tests/unit/config/conftest.py, tests/unit/budget/test_quota.py, tests/unit/observability/test_events.py, tests/unit/budget/test_quota_tracker.py, tests/unit/budget/conftest.py
🧠 Learnings (1)
📚 Learning: 2026-03-09T10:20:23.072Z
Learnt from: CR
Repo: Aureliolo/ai-company PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-09T10:20:23.072Z
Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`
Applied to files:
src/ai_company/observability/events/budget.py, src/ai_company/observability/events/quota.py, tests/unit/observability/test_events.py, DESIGN_SPEC.md, src/ai_company/budget/enforcer.py
🧬 Code graph analysis (10)
- src/ai_company/engine/__init__.py (1)
  - src/ai_company/engine/errors.py (1): QuotaExhaustedError(41-46)
- src/ai_company/config/schema.py (2)
  - src/ai_company/budget/cost_tiers.py (1): CostTiersConfig(85-116)
  - src/ai_company/budget/quota.py (2): DegradationConfig(163-201), SubscriptionConfig(77-146)
- tests/unit/budget/test_cost_tiers.py (1)
  - src/ai_company/budget/cost_tiers.py (4): CostTierDefinition(23-82), CostTiersConfig(85-116), classify_model_tier(202-244), resolve_tiers(163-199)
- tests/unit/config/conftest.py (2)
  - src/ai_company/budget/cost_tiers.py (1): CostTiersConfig(85-116)
  - src/ai_company/budget/quota.py (2): DegradationConfig(163-201), SubscriptionConfig(77-146)
- tests/unit/budget/test_quota.py (1)
  - src/ai_company/budget/quota.py (12): DegradationAction(149-160), DegradationConfig(163-201), ProviderCostModel(62-74), QuotaCheckResult(273-302), QuotaLimit(30-59), QuotaWindow(21-27), SubscriptionConfig(77-146), effective_cost_per_1k(350-370), window_start(305-347), requests_remaining(242-250), tokens_remaining(254-262), is_exhausted(266-270)
- src/ai_company/budget/cost_tiers.py (1)
  - src/ai_company/observability/_logger.py (1): get_logger(8-28)
- src/ai_company/budget/quota_tracker.py (3)
  - src/ai_company/budget/quota.py (5): QuotaCheckResult(273-302), QuotaLimit(30-59), QuotaSnapshot(204-270), QuotaWindow(21-27), window_start(305-347)
  - src/ai_company/observability/_logger.py (1): get_logger(8-28)
  - src/ai_company/budget/enforcer.py (1): check_quota(142-170)
- tests/unit/budget/conftest.py (4)
  - src/ai_company/budget/cost_tiers.py (2): CostTierDefinition(23-82), CostTiersConfig(85-116)
  - src/ai_company/budget/enums.py (1): BudgetAlertLevel(6-16)
  - src/ai_company/budget/quota.py (3): QuotaLimit(30-59), QuotaWindow(21-27), SubscriptionConfig(77-146)
  - src/ai_company/budget/quota_tracker.py (1): QuotaTracker(49-342)
- src/ai_company/budget/quota.py (1)
  - src/ai_company/observability/_logger.py (1): get_logger(8-28)
- src/ai_company/budget/enforcer.py (3)
  - src/ai_company/budget/quota.py (1): QuotaCheckResult(273-302)
  - src/ai_company/engine/errors.py (2): BudgetExhaustedError(24-34), QuotaExhaustedError(41-46)
  - src/ai_company/budget/quota_tracker.py (2): QuotaTracker(49-342), check_quota(159-252)
🪛 LanguageTool
README.md
[typographical] ~26-~26: To join two clauses or introduce examples, consider using an em dash.
Context: ...n failures - Budget Enforcement (M5) - BudgetEnforcer service with pre-flight...
(DASH_RULE)
🔇 Additional comments (4)
src/ai_company/observability/events/budget.py (1)
30-31: LGTM. The new tier event constants fit the existing naming scheme and give the cost-tier path dedicated observability hooks.
src/ai_company/observability/events/quota.py (1)
5-11: LGTM. The quota event surface is small, specific, and matches the tracker/check operations added in this PR.
tests/unit/budget/conftest.py (1)
108-126: LGTM. The new factories and `make_quota_tracker()` helper centralize quota/cost-tier setup and keep the tests on generic provider/model identifiers.
Also applies to: 266-277
src/ai_company/budget/__init__.py (1)
32-57: LGTM. The package exports stay in sync with the new cost-tier and quota modules, which makes the public budget surface coherent from `ai_company.budget`.
Also applies to: 67-112
    async def record_usage(
        self,
        provider_name: str,
        *,
        requests: int = 1,
        tokens: int = 0,
    ) -> None:
        """Record usage against all configured windows for a provider.

        Rotates window counters if a window boundary has been crossed.
        Providers with no subscription config are skipped with a DEBUG log.

        Args:
            provider_name: Provider to record usage for.
            requests: Number of requests to record (must be >= 0).
            tokens: Number of tokens to record (must be >= 0).

        Raises:
            ValueError: If requests or tokens is negative.
        """
        if requests < 0:
            msg = f"requests must be non-negative, got {requests}"
            raise ValueError(msg)
        if tokens < 0:
            msg = f"tokens must be non-negative, got {tokens}"
            raise ValueError(msg)

        if provider_name not in self._usage:
            logger.debug(
                QUOTA_USAGE_SKIPPED,
                provider=provider_name,
                reason="no_subscription_config",
            )
            return

        async with self._lock:
            now = datetime.now(UTC)
            provider_usage = self._usage[provider_name]

            for window_type in list(provider_usage):
                current = provider_usage[window_type]
                expected_start = window_start(window_type, now=now)

                if expected_start != current.window_start:
                    # Window boundary crossed — rotate
                    provider_usage[window_type] = _WindowUsage(
                        requests=requests,
                        tokens=tokens,
                        window_start=expected_start,
                    )
                    logger.debug(
                        QUOTA_WINDOW_ROTATED,
                        provider=provider_name,
                        window=window_type.value,
                        old_start=str(current.window_start),
                        new_start=str(expected_start),
                    )
                else:
                    provider_usage[window_type] = _WindowUsage(
                        requests=current.requests + requests,
                        tokens=current.tokens + tokens,
                        window_start=current.window_start,
                    )

        logger.debug(
            QUOTA_USAGE_RECORDED,
            provider=provider_name,
            requests=requests,
            tokens=tokens,
        )
check_quota() and record_usage() still have a TOCTOU gap.
These are separate locked operations. Under load, multiple coroutines can all observe the same remaining capacity, all pass check_quota(), and only later increment the counters in record_usage(). That allows the tracker to oversubscribe a window even though each individual call looked safe. If this is meant to enforce provider caps, the admission step needs an atomic reserve/consume path under one lock.
Also applies to: 159-243
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/ai_company/budget/quota_tracker.py` around lines 88 - 157, check_quota()
and record_usage() have a TOCTOU gap: under concurrency multiple coroutines can
each pass check_quota() then race to record_usage(), oversubscribing limits. Fix
by introducing an atomic admission method (e.g. try_consume or
reserve_and_consume) that acquires self._lock, computes remaining capacity using
the same window logic (window_start via window_start(...) and current from
self._usage[provider_name]), and if capacity suffices increments the counters
and returns success; update callers to use this new atomic method instead of
calling check_quota() then record_usage(), and leave check_quota() as a pure
read-only inspector (or remove it) so all admission decisions are made via the
single locked path in the new method; also adjust record_usage() so it’s either
private for non-admission bookkeeping or folded into the atomic consume
implementation.
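The atomic admission path suggested above could look roughly like the following. This is a minimal sketch under stated assumptions, not the project's `QuotaTracker`: `AtomicQuota`, `WindowUsage`, and `try_consume` are hypothetical names, window rotation is left out, and both limits are treated as hard caps.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowUsage:
    """Immutable usage counters for a single quota window (hypothetical)."""

    requests: int = 0
    tokens: int = 0


class AtomicQuota:
    """Single-window tracker with one locked check-and-consume path."""

    def __init__(self, max_requests: int, max_tokens: int) -> None:
        self._max_requests = max_requests
        self._max_tokens = max_tokens
        self._usage = WindowUsage()
        self._lock = asyncio.Lock()

    @property
    def usage(self) -> WindowUsage:
        return self._usage

    async def try_consume(self, *, requests: int = 1, tokens: int = 0) -> bool:
        """Check capacity and record usage while holding the same lock."""
        if requests < 0 or tokens < 0:
            msg = "requests and tokens must be non-negative"
            raise ValueError(msg)
        async with self._lock:
            new_requests = self._usage.requests + requests
            new_tokens = self._usage.tokens + tokens
            if new_requests > self._max_requests or new_tokens > self._max_tokens:
                return False  # Denied: counters are left unchanged.
            # Replace the counters with a new object instead of mutating in place.
            self._usage = WindowUsage(requests=new_requests, tokens=new_tokens)
            return True


async def demo() -> tuple[int, WindowUsage]:
    """Fire ten concurrent admissions against a cap of three requests."""
    quota = AtomicQuota(max_requests=3, max_tokens=1_000)
    # gather is used for brevity; a TaskGroup would work equally well.
    results = await asyncio.gather(
        *(quota.try_consume(tokens=100) for _ in range(10))
    )
    return sum(results), quota.usage
```

Because the capacity check and the counter update happen inside one `async with self._lock` block, no coroutine can observe stale remaining capacity between the two steps, which is the gap the review describes for separate `check_quota()` and `record_usage()` calls.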
- Change DegradationConfig default strategy from FALLBACK to ALERT
- Return None (not 0) from requests_remaining/tokens_remaining for unlimited
- Add UTC validation to window_start() — reject naive datetimes
- Thread estimated_tokens through check_can_execute → _check_provider_quota
- Reject zero-width cost tiers (min == max with finite max)
- Add RootConfig cross-validation of degradation fallback_providers
- Change token exhaustion check from >= to > (allow exact-fill)
- Distinguish "unknown provider" vs "no quotas" in QuotaTracker logs
- Add TOCTOU gap documentation to QuotaTracker class docstring
- Document error-handling asymmetry in BudgetEnforcer.check_quota
- Add allow_inf_nan=False to QuotaCheckResult model config
- Add logging before raise in model validators and input validation
- Fix module docstring reference (core.enums → ai_company.core.enums)
- Fix enum docstrings: Attributes → Members for StrEnum classes
- Fix DESIGN_SPEC: resolve_tiers signature, CostTierDefinition name, degradation implementation status
- Collapse duplicate boundary tests into parametrize table
- Add negative-cost test, naive-datetime test, fix midnight race
- Change test helpers from PER_MINUTE to PER_HOUR (avoid flakiness)
- Strengthen test_provider_without_quotas_not_tracked assertions
    logger.debug(
        QUOTA_WINDOW_ROTATED,
        provider=provider_name,
        window=window_type.value,
        old_start=str(current.window_start),
        new_start=str(expected_start),
    )
**Window rotation logged at `DEBUG`, violating the INFO-for-state-transitions convention**

Path: src/ai_company/budget/quota_tracker.py
Line: 162-168

`CLAUDE.md` states: *"All state transitions must log at INFO"*. A quota window rotation is a meaningful state transition — it resets accumulated counters, directly affecting subsequent enforcement decisions. Logging it at `DEBUG` means ops teams running at `INFO` level would miss quota resets entirely.

`QUOTA_CHECK_DENIED` is correctly logged at `INFO`; the rotation that un-blocks a provider is equally significant and should match:

```python
logger.info(
    QUOTA_WINDOW_ROTATED,
    provider=provider_name,
    window=window_type.value,
    old_start=str(current.window_start),
    new_start=str(expected_start),
)
```

**Rule Used:** CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=6816cd03-d0e1-4fd0-9d04-2417487a584c))
    def effective_cost_per_1k(
        cost_per_1k_input: float,
        cost_per_1k_output: float,
        cost_model: ProviderCostModel,
    ) -> float:
        """Compute effective cost per 1k tokens based on cost model.

        Returns 0.0 for SUBSCRIPTION and LOCAL models (pre-paid / free).
        Returns ``cost_per_1k_input + cost_per_1k_output`` for PER_TOKEN.

        Args:
            cost_per_1k_input: Cost per 1k input tokens.
            cost_per_1k_output: Cost per 1k output tokens.
            cost_model: The provider's cost model.

        Returns:
            Effective cost per 1k tokens.
        """
        if cost_model in (ProviderCostModel.SUBSCRIPTION, ProviderCostModel.LOCAL):
            return 0.0
        return cost_per_1k_input + cost_per_1k_output
**`effective_cost_per_1k` silently accepts and propagates negative cost components**

Path: src/ai_company/budget/quota.py
Line: 377-397

For `PER_TOKEN` models, the function returns `cost_per_1k_input + cost_per_1k_output` without any validation. A negative input cost produces a negative-or-reduced total, which `classify_model_tier` handles by returning `None` (only catches `cost_per_1k_total < 0`) — but a partially-negative combination (e.g., `-0.001 + 0.005 = 0.004`) would silently classify as `"medium"` even though a negative input cost has no valid domain meaning. This is explicitly tested and documented as "returns the sum as-is", but the function offers no guard and no docstring note about this edge case, which could surprise callers who feed unchecked provider config values.

Consider at minimum adding a note to the docstring, or a log warning (matching CLAUDE.md's "All error paths must log at WARNING") if either component is negative:

```python
if cost_per_1k_input < 0 or cost_per_1k_output < 0:
    logger.warning(
        BUDGET_TIER_CLASSIFY_MISS,
        cost_per_1k_input=cost_per_1k_input,
        cost_per_1k_output=cost_per_1k_output,
        reason="negative_cost_component",
    )
return cost_per_1k_input + cost_per_1k_output
```
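As an alternative to the warning suggested in this comment, a stricter variant could reject negative components outright. The sketch below is not the repository's function: `CostModel` is a hypothetical stand-in for `ProviderCostModel`, `effective_cost_per_1k_strict` is an invented name, and raising `ValueError` is a design choice the review does not ask for.

```python
from enum import Enum


class CostModel(Enum):
    """Hypothetical stand-in for ProviderCostModel."""

    PER_TOKEN = "per_token"
    SUBSCRIPTION = "subscription"
    LOCAL = "local"


def effective_cost_per_1k_strict(
    cost_per_1k_input: float,
    cost_per_1k_output: float,
    cost_model: CostModel,
) -> float:
    """Like the reviewed function, but fail fast on negative components."""
    if cost_per_1k_input < 0 or cost_per_1k_output < 0:
        msg = (
            "cost components must be non-negative, got "
            f"input={cost_per_1k_input}, output={cost_per_1k_output}"
        )
        raise ValueError(msg)
    if cost_model in (CostModel.SUBSCRIPTION, CostModel.LOCAL):
        return 0.0  # Pre-paid or free: no marginal per-token cost.
    return cost_per_1k_input + cost_per_1k_output
```

With this variant the partially negative case from the comment (`-0.001` input, `0.005` output) raises instead of yielding a plausible-looking `0.004`.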
    def is_exhausted(self) -> bool:
        """Whether any enforced limit in this window is exhausted."""
        if self.requests_limit > 0 and self.requests_used >= self.requests_limit:
            return True
        return self.tokens_limit > 0 and self.tokens_used >= self.tokens_limit
**`is_exhausted` and `_is_window_exhausted` disagree on the token-exact-fill boundary**

Path: src/ai_company/budget/quota.py
Line: 278-282

`QuotaSnapshot.is_exhausted` considers the window exhausted when `tokens_used >= tokens_limit` — "at-limit" equals exhausted. However, the enforcement predicate `_is_window_exhausted` in `quota_tracker.py` uses `usage.tokens + estimated_tokens > quota.max_tokens` — "at-limit with no projected tokens" is **not** exhausted.

When all tokens have been consumed and a check is made with no estimated projection, the snapshot reports `is_exhausted=True` but `QuotaTracker.check_quota` returns `allowed=True`. A caller consuming the snapshot API would reasonably expect these two signals to agree.

This is especially relevant since `check_can_execute` calls `_check_provider_quota` without forwarding `estimated_tokens` (it defaults to zero), meaning the at-limit state would pass the pre-flight check even though the snapshot already shows exhaustion.

Consider documenting the divergence in the `is_exhausted` docstring (explaining it is a conservative display signal, not the enforcement predicate), or aligning the two by switching `is_exhausted` to use a strict `>` comparison for the token check to match `_is_window_exhausted`.
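The boundary disagreement described in this comment can be reproduced with two small predicates. These are simplified mirrors written for illustration, with hypothetical function names; the `max_tokens > 0` guard in the enforcement version is an assumption, since the review only quotes the comparison itself.

```python
def snapshot_is_exhausted(tokens_used: int, tokens_limit: int) -> bool:
    """Token half of the quoted QuotaSnapshot.is_exhausted check (>=)."""
    return tokens_limit > 0 and tokens_used >= tokens_limit


def enforcement_is_exhausted(
    tokens_used: int, estimated_tokens: int, max_tokens: int
) -> bool:
    """Enforcement comparison as described in the review (strict >)."""
    return max_tokens > 0 and tokens_used + estimated_tokens > max_tokens


# Exactly at the limit, with no projected tokens for the next call.
used, limit = 1_000, 1_000
display_says_exhausted = snapshot_is_exhausted(used, limit)
enforcer_says_exhausted = enforcement_is_exhausted(used, 0, limit)
```

At `used == limit` the two results differ, and they agree again as soon as the caller passes any positive `estimated_tokens`, which is why forwarding the estimate through the pre-flight check matters.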
🤖 I have created a release *beep* *boop*

---

## [0.1.1](ai-company-v0.1.0...ai-company-v0.1.1) (2026-03-10)

### Features

* add autonomy levels and approval timeout policies ([#42](#42), [#126](#126)) ([#197](#197)) ([eecc25a](eecc25a))
* add CFO cost optimization service with anomaly detection, reports, and approval decisions ([#186](#186)) ([a7fa00b](a7fa00b))
* add code quality toolchain (ruff, mypy, pre-commit, dependabot) ([#63](#63)) ([36681a8](36681a8))
* add configurable cost tiers and subscription/quota-aware tracking ([#67](#67)) ([#185](#185)) ([9baedfa](9baedfa))
* add container packaging, Docker Compose, and CI pipeline ([#269](#269)) ([435bdfe](435bdfe)), closes [#267](#267)
* add coordination error taxonomy classification pipeline ([#146](#146)) ([#181](#181)) ([70c7480](70c7480))
* add cost-optimized, hierarchical, and auction assignment strategies ([#175](#175)) ([ce924fa](ce924fa)), closes [#173](#173)
* add design specification, license, and project setup ([8669a09](8669a09))
* add env var substitution and config file auto-discovery ([#77](#77)) ([7f53832](7f53832))
* add FastestStrategy routing + vendor-agnostic cleanup ([#140](#140)) ([09619cb](09619cb)), closes [#139](#139)
* add HR engine and performance tracking ([#45](#45), [#47](#47)) ([#193](#193)) ([2d091ea](2d091ea))
* add issue auto-search and resolution verification to PR review skill ([#119](#119)) ([deecc39](deecc39))
* add memory retrieval, ranking, and context injection pipeline ([#41](#41)) ([873b0aa](873b0aa))
* add pluggable MemoryBackend protocol with models, config, and events ([#180](#180)) ([46cfdd4](46cfdd4))
* add pluggable MemoryBackend protocol with models, config, and events ([#32](#32)) ([46cfdd4](46cfdd4))
* add pluggable PersistenceBackend protocol with SQLite implementation ([#36](#36)) ([f753779](f753779))
* add progressive trust and promotion/demotion subsystems ([#43](#43), [#49](#49)) ([3a87c08](3a87c08))
* add retry handler, rate limiter, and provider resilience ([#100](#100)) ([b890545](b890545))
* add SecOps security agent with rule engine, audit log, and ToolInvoker integration ([#40](#40)) ([83b7b6c](83b7b6c))
* add shared org memory and memory consolidation/archival ([#125](#125), [#48](#48)) ([4a0832b](4a0832b))
* design unified provider interface ([#86](#86)) ([3e23d64](3e23d64))
* expand template presets, rosters, and add inheritance ([#80](#80), [#81](#81), [#84](#84)) ([15a9134](15a9134))
* implement agent runtime state vs immutable config split ([#115](#115)) ([4cb1ca5](4cb1ca5))
* implement AgentEngine core orchestrator ([#11](#11)) ([#143](#143)) ([f2eb73a](f2eb73a))
* implement basic tool system (registry, invocation, results) ([#15](#15)) ([c51068b](c51068b))
* implement built-in file system tools ([#18](#18)) ([325ef98](325ef98))
* implement communication foundation — message bus, dispatcher, and messenger ([#157](#157)) ([8e71bfd](8e71bfd))
* implement company template system with 7 built-in presets ([#85](#85)) ([cbf1496](cbf1496))
* implement conflict resolution protocol ([#122](#122)) ([#166](#166)) ([e03f9f2](e03f9f2))
* implement core entity and role system models ([#69](#69)) ([acf9801](acf9801))
* implement crash recovery with fail-and-reassign strategy ([#149](#149)) ([e6e91ed](e6e91ed))
* implement engine extensions — Plan-and-Execute loop and call categorization ([#134](#134), [#135](#135)) ([#159](#159)) ([9b2699f](9b2699f))
* implement enterprise logging system with structlog ([#73](#73)) ([2f787e5](2f787e5))
* implement graceful shutdown with cooperative timeout strategy ([#130](#130)) ([6592515](6592515))
* implement hierarchical delegation and loop prevention ([#12](#12), [#17](#17)) ([6be60b6](6be60b6))
* implement LiteLLM driver and provider registry ([#88](#88)) ([ae3f18b](ae3f18b)), closes [#4](#4)
* implement LLM decomposition strategy and workspace isolation ([#174](#174)) ([aa0eefe](aa0eefe))
* implement meeting protocol system ([#123](#123)) ([ee7caca](ee7caca))
* implement message and communication domain models ([#74](#74)) ([560a5d2](560a5d2))
* implement model routing engine ([#99](#99)) ([d3c250b](d3c250b))
* implement parallel agent execution ([#22](#22)) ([#161](#161)) ([65940b3](65940b3))
* implement per-call cost tracking service ([#7](#7)) ([#102](#102)) ([c4f1f1c](c4f1f1c))
* implement personality injection and system prompt construction ([#105](#105)) ([934dd85](934dd85))
* implement single-task execution lifecycle ([#21](#21)) ([#144](#144)) ([c7e64e4](c7e64e4))
* implement subprocess sandbox for tool execution isolation ([#131](#131)) ([#153](#153)) ([3c8394e](3c8394e))
* implement task assignment subsystem with pluggable strategies ([#172](#172)) ([c7f1b26](c7f1b26)), closes [#26](#26) [#30](#30)
* implement task decomposition and routing engine ([#14](#14)) ([9c7fb52](9c7fb52))
* implement Task, Project, Artifact, Budget, and Cost domain models ([#71](#71)) ([81eabf1](81eabf1))
* implement tool permission checking ([#16](#16)) ([833c190](833c190))
* implement YAML config loader with Pydantic validation ([#59](#59)) ([ff3a2ba](ff3a2ba))
* implement YAML config loader with Pydantic validation ([#75](#75)) ([ff3a2ba](ff3a2ba))
* initialize project with uv, hatchling, and src layout ([39005f9](39005f9))
* initialize project with uv, hatchling, and src layout ([#62](#62)) ([39005f9](39005f9))
* Litestar REST API, WebSocket feed, and approval queue (M6) ([#189](#189)) ([29fcd08](29fcd08))
* make TokenUsage.total_tokens a computed field ([#118](#118)) ([c0bab18](c0bab18)), closes [#109](#109)
* parallel tool execution in ToolInvoker.invoke_all ([#137](#137)) ([58517ee](58517ee))
* testing framework, CI pipeline, and M0 gap fixes ([#64](#64)) ([f581749](f581749))
* wire all modules into observability system ([#97](#97)) ([f7a0617](f7a0617))

### Bug Fixes

* address Greptile post-merge review findings from PRs [#170](https://github.com/Aureliolo/ai-company/issues/170)-[#175](https://github.com/Aureliolo/ai-company/issues/175) ([#176](#176)) ([c5ca929](c5ca929))
* address post-merge review feedback from PRs [#164](https://github.com/Aureliolo/ai-company/issues/164)-[#167](https://github.com/Aureliolo/ai-company/issues/167) ([#170](#170)) ([3bf897a](3bf897a)), closes [#169](#169)
* enforce strict mypy on test files ([#89](#89)) ([aeeff8c](aeeff8c))
* harden Docker sandbox, MCP bridge, and code runner ([#50](#50), [#53](#53)) ([d5e1b6e](d5e1b6e))
* harden git tools security + code quality improvements ([#150](#150)) ([000a325](000a325))
* harden subprocess cleanup, env filtering, and shutdown resilience ([#155](#155)) ([d1fe1fb](d1fe1fb))
* incorporate post-merge feedback + pre-PR review fixes ([#164](#164)) ([c02832a](c02832a))
* pre-PR review fixes for post-merge findings ([#183](#183)) ([26b3108](26b3108))
* strengthen immutability for BaseTool schema and ToolInvoker boundaries ([#117](#117)) ([7e5e861](7e5e861))

### Performance

* harden non-inferable principle implementation ([#195](#195)) ([02b5f4e](02b5f4e)), closes [#188](#188)

### Refactoring

* adopt NotBlankStr across all models ([#108](#108)) ([#120](#120)) ([ef89b90](ef89b90))
* extract _SpendingTotals base class from spending summary models ([#111](#111)) ([2f39c1b](2f39c1b))
* harden BudgetEnforcer with error handling, validation extraction, and review fixes ([#182](#182)) ([c107bf9](c107bf9))
* harden personality profiles, department validation, and template rendering ([#158](#158)) ([10b2299](10b2299))
* pre-PR review improvements for ExecutionLoop + ReAct loop ([#124](#124)) ([8dfb3c0](8dfb3c0))
* split events.py into per-domain event modules ([#136](#136)) ([e9cba89](e9cba89))

### Documentation

* add ADR-001 memory layer evaluation and selection ([#178](#178)) ([db3026f](db3026f)), closes [#39](#39)
* add agent scaling research findings to DESIGN_SPEC ([#145](#145)) ([57e487b](57e487b))
* add CLAUDE.md, contributing guide, and dev documentation ([#65](#65)) ([55c1025](55c1025)), closes [#54](#54)
* add crash recovery, sandboxing, analytics, and testing decisions ([#127](#127)) ([5c11595](5c11595))
* address external review feedback with MVP scope and new protocols ([#128](#128)) ([3b30b9a](3b30b9a))
* expand design spec with pluggable strategy protocols ([#121](#121)) ([6832db6](6832db6))
* finalize 23 design decisions (ADR-002) ([#190](#190)) ([8c39742](8c39742))
* update project docs for M2.5 conventions and add docs-consistency review agent ([#114](#114)) ([99766ee](99766ee))

### Tests

* add e2e single agent integration tests ([#24](#24)) ([#156](#156)) ([f566fb4](f566fb4))
* add provider adapter integration tests ([#90](#90)) ([40a61f4](40a61f4))

### CI/CD

* add Release Please for automated versioning and GitHub Releases ([#278](#278)) ([a488758](a488758))
* bump actions/checkout from 4 to 6 ([#95](#95)) ([1897247](1897247))
* bump actions/upload-artifact from 4 to 7 ([#94](#94)) ([27b1517](27b1517))
* harden CI/CD pipeline ([#92](#92)) ([ce4693c](ce4693c))
* split vulnerability scans into critical-fail and high-warn tiers ([#277](#277)) ([aba48af](aba48af))

### Maintenance

* add /worktree skill for parallel worktree management ([#171](#171)) ([951e337](951e337))
* add design spec context loading to research-link skill ([8ef9685](8ef9685))
* add post-merge-cleanup skill ([#70](#70)) ([f913705](f913705))
* add pre-pr-review skill and update CLAUDE.md ([#103](#103)) ([92e9023](92e9023))
* add research-link skill and rename skill files to SKILL.md ([#101](#101)) ([651c577](651c577))
* bump aiosqlite from 0.21.0 to 0.22.1 ([#191](#191)) ([3274a86](3274a86))
* bump pyyaml from 6.0.2 to 6.0.3 in the minor-and-patch group ([#96](#96)) ([0338d0c](0338d0c))
* bump ruff from 0.15.4 to 0.15.5 ([a49ee46](a49ee46))
* fix M0 audit items ([#66](#66)) ([c7724b5](c7724b5))
* pin setup-uv action to full SHA ([#281](#281)) ([4448002](4448002))
* post-audit cleanup — PEP 758, loggers, bug fixes, refactoring, tests, hookify rules ([#148](#148)) ([c57a6a9](c57a6a9))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Summary
- `budget/cost_tiers.py`: `CostTierDefinition` model with price ranges, display properties, and a `classify_model_tier()` function. Built-in 4-tier defaults (low/medium/high/premium) with user override/merge via `resolve_tiers()`.
- `budget/quota.py`: `SubscriptionConfig`, `QuotaLimit`, `QuotaWindow`, `ProviderCostModel`, `DegradationConfig`, `QuotaSnapshot`, `QuotaCheckResult`: frozen Pydantic models for provider subscription/quota configuration.
- `budget/quota_tracker.py`: `QuotaTracker` with per-provider, per-window (minute/hour/day/month) request and token tracking, automatic window rotation, and async safety via `asyncio.Lock`.
- `BudgetEnforcer.check_can_execute()` now includes provider quota checks; `check_quota()` is exposed as a public API; `QuotaExhaustedError` is raised on exhaustion.
- `ProviderConfig` gains `subscription` + `degradation` fields; `RootConfig` gains a `cost_tiers` field.

Closes #67
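For readers who want a feel for the windowed tracking described in the summary, here is a minimal sketch of per-provider, per-window request and token counting with lazy window rotation behind an `asyncio.Lock`. It is not the code in `budget/quota_tracker.py`: the class name `SketchQuotaTracker`, its `check`/`record` methods, the `(max_requests, max_tokens)` limit tuples, the 30-day month, and the injectable clock are all assumptions made for illustration.

```python
import asyncio
import time
from dataclasses import dataclass

# Window lengths in seconds. Treating a month as 30 days is an assumption.
WINDOW_SECONDS = {"minute": 60, "hour": 3600, "day": 86400, "month": 30 * 86400}


@dataclass
class _WindowUsage:
    """Counters for one (provider, window) pair."""
    started_at: float
    requests: int = 0
    tokens: int = 0


class SketchQuotaTracker:
    """Hypothetical stand-in for QuotaTracker; not the PR's implementation."""

    def __init__(self, limits, clock=time.monotonic):
        # limits: {provider: {window_name: (max_requests, max_tokens)}}
        self._limits = limits
        self._clock = clock
        self._usage = {}
        self._lock = asyncio.Lock()

    def _current(self, provider, window):
        """Return the live counters, starting a fresh window if the old one elapsed."""
        now = self._clock()
        usage = self._usage.get((provider, window))
        if usage is None or now - usage.started_at >= WINDOW_SECONDS[window]:
            usage = _WindowUsage(started_at=now)
            self._usage[(provider, window)] = usage
        return usage

    async def check(self, provider, estimated_tokens=0):
        """Return (allowed, reason). Providers without limits are always allowed."""
        if estimated_tokens < 0:
            raise ValueError("estimated_tokens must be non-negative")
        async with self._lock:
            for window, (max_req, max_tok) in self._limits.get(provider, {}).items():
                usage = self._current(provider, window)
                if usage.requests + 1 > max_req:
                    return False, f"{provider}: request cap reached ({window})"
                if usage.tokens + estimated_tokens > max_tok:
                    return False, f"{provider}: token cap reached ({window})"
            return True, ""

    async def record(self, provider, requests=1, tokens=0):
        """Add completed usage to every configured window for the provider."""
        if requests < 0 or tokens < 0:
            raise ValueError("requests and tokens must be non-negative")
        async with self._lock:
            for window in self._limits.get(provider, {}):
                usage = self._current(provider, window)
                usage.requests += requests
                usage.tokens += tokens
```

A caller would typically `await tracker.check(provider, estimated_tokens)` before dispatching a request and `await tracker.record(...)` afterwards. Rotating windows lazily on access avoids a background timer, at the cost of fixed (not sliding) windows.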
Test plan
Review coverage
Pre-reviewed by 9 agents (code-reviewer, python-reviewer, pr-test-analyzer, silent-failure-hunter, comment-analyzer, type-design-analyzer, logging-audit, resilience-audit, docs-consistency). 34 findings addressed, 0 skipped.