Implement budget enforcement engine with task-boundary auto-downgrade (DESIGN_SPEC §10.4) #44

@Aureliolo

Description

Context

Implement budget enforcement per DESIGN_SPEC §10.4. Since agents make LLM calls that cost real money, the system must enforce spending limits at multiple levels to prevent runaway costs.

Acceptance Criteria

Budget Controls

  • Total monthly budget limit (hard cap)
  • Alert notification at 75% of budget consumed
  • Critical warning at 90% of budget consumed
  • Hard stop at 100% — blocks all further LLM calls
  • Per-task spending limit
  • Per-agent daily spending limit
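
The thresholds above can be read as a simple classification of monthly spend against the cap. The following is a minimal sketch, not the project's implementation: the names `BudgetLimits`, `BudgetStatus`, and `classify` are hypothetical, and `Decimal` is assumed for money to avoid float rounding at the exact 75%, 90%, and 100% boundaries.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum
from typing import Optional


class BudgetStatus(Enum):
    OK = "ok"
    ALERT = "alert"          # 75% of the monthly budget consumed
    CRITICAL = "critical"    # 90% of the monthly budget consumed
    HARD_STOP = "hard_stop"  # 100% consumed: no further LLM calls


@dataclass(frozen=True)
class BudgetLimits:
    """Hypothetical container for the limits listed in this issue."""
    monthly_limit: Decimal
    alert_ratio: Decimal = Decimal("0.75")
    critical_ratio: Decimal = Decimal("0.90")
    per_task_limit: Optional[Decimal] = None
    per_agent_daily_limit: Optional[Decimal] = None


def classify(spent: Decimal, limits: BudgetLimits) -> BudgetStatus:
    """Map monthly spend to a status; each threshold is inclusive."""
    ratio = spent / limits.monthly_limit
    if ratio >= 1:
        return BudgetStatus.HARD_STOP
    if ratio >= limits.critical_ratio:
        return BudgetStatus.CRITICAL
    if ratio >= limits.alert_ratio:
        return BudgetStatus.ALERT
    return BudgetStatus.OK
```

Treating each threshold as inclusive matches the "fires at exactly 75%" and "blocks at exactly 100%" test criteria further down.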

Enforcement

  • Budget check performed before every LLM call
  • Alert/warning emitted at threshold crossings
  • Hard stop blocks LLM calls with clear error message
  • Auto model downgrade at configurable threshold (e.g., switch from Opus to Sonnet)
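
One way to picture the enforcement flow is a guard that is consulted before each LLM call and updated after it. This is a sketch under assumptions: `BudgetGuard`, `BudgetExceededError`, and the `notify` callback are hypothetical names, and each threshold notification is assumed to fire once per budget period, at the moment it is crossed.

```python
from decimal import Decimal
from typing import Callable, Set


class BudgetExceededError(RuntimeError):
    """Raised when a call is attempted after the hard cap is reached."""


class BudgetGuard:
    # (ratio, level) pairs taken from the thresholds in this issue.
    THRESHOLDS = ((Decimal("0.75"), "alert"), (Decimal("0.90"), "critical"))

    def __init__(self, monthly_limit: Decimal,
                 notify: Callable[[str, Decimal], None]) -> None:
        self.monthly_limit = monthly_limit
        self.spent = Decimal("0")
        self._notify = notify
        self._fired: Set[str] = set()

    def check_before_call(self) -> None:
        """Run before every LLM call; blocks once 100% is consumed."""
        if self.spent >= self.monthly_limit:
            raise BudgetExceededError(
                f"Monthly budget exhausted: spent {self.spent} of "
                f"{self.monthly_limit}. LLM calls are blocked until reset."
            )

    def record_cost(self, cost: Decimal) -> None:
        """Run after a call; emits each threshold crossing exactly once."""
        self.spent += cost
        ratio = self.spent / self.monthly_limit
        for threshold, level in self.THRESHOLDS:
            if ratio >= threshold and level not in self._fired:
                self._fired.add(level)
                self._notify(level, ratio)
```

A caller would wrap each request as `guard.check_before_call()`, then the LLM call, then `guard.record_cost(cost)`. Because the check runs before the call, the final call that crosses 100% still completes; a stricter variant could reserve an estimated cost up front.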

Auto-Downgrade Boundary (§10.4)

  • Model downgrades apply only at task assignment time, never mid-execution
  • An agent halfway through a task cannot be switched to a cheaper model — the task completes on its assigned model
  • The next task assignment respects the downgrade threshold
  • boundary: "task_assignment" field in auto-downgrade config
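
The boundary rule can be illustrated by resolving the model once, when the task is assigned, and storing it on an immutable task record. This is a sketch only: `AutoDowngradeConfig`, `Task`, `assign_task`, the 80% threshold, and the `"opus"`/`"sonnet"` map are hypothetical; only the `boundary: "task_assignment"` field comes from this issue.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict


@dataclass(frozen=True)
class AutoDowngradeConfig:
    threshold_ratio: Decimal          # configurable; value is an assumption
    downgrade_map: Dict[str, str]     # e.g. {"opus": "sonnet"}
    boundary: str = "task_assignment"


@dataclass(frozen=True)
class Task:
    """Frozen so the assigned model cannot change mid-execution."""
    task_id: str
    model: str


def assign_task(task_id: str, requested_model: str, spent: Decimal,
                monthly_limit: Decimal, cfg: AutoDowngradeConfig) -> Task:
    """Pick the model at assignment time; later budget changes do not affect it."""
    model = requested_model
    if spent / monthly_limit >= cfg.threshold_ratio:
        # Fall back to the requested model if no cheaper mapping exists.
        model = cfg.downgrade_map.get(requested_model, requested_model)
    return Task(task_id=task_id, model=model)
```

Under this sketch, a task assigned at 50% spend keeps its original model even if spend passes the threshold while it runs; only the next call to `assign_task` sees the downgrade.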

Tracking & Reset

  • Per-agent spending tracked and queryable
  • Per-task spending tracked and queryable
  • Monthly budget auto-reset on configured date
  • Spending history retained for reporting
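
Tracking and reset could be sketched as an in-memory tracker that rolls over when the configured reset day is reached. The names `SpendingTracker` and `period_start` are hypothetical, the reset day is assumed to be between 1 and 28 to sidestep month-length edge cases, and per-task totals are assumed to persist across the monthly reset since a task may span two periods.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal
from typing import Dict, List, Optional, Tuple


def period_start(today: date, reset_day: int) -> date:
    """Return the first day of the budget period containing `today`."""
    if today.day >= reset_day:
        return today.replace(day=reset_day)
    # Before this month's reset day: the period began last month.
    if today.month == 1:
        return date(today.year - 1, 12, reset_day)
    return date(today.year, today.month - 1, reset_day)


class SpendingTracker:
    def __init__(self, reset_day: int = 1) -> None:
        if not 1 <= reset_day <= 28:
            raise ValueError("reset_day must be between 1 and 28")
        self.reset_day = reset_day
        self.current_period: Optional[date] = None
        self.by_agent: Dict[str, Decimal] = defaultdict(Decimal)
        self.by_task: Dict[str, Decimal] = defaultdict(Decimal)
        # Archived (period start, per-agent totals) pairs for reporting.
        self.history: List[Tuple[date, Dict[str, Decimal]]] = []

    def _roll_over(self, today: date) -> None:
        start = period_start(today, self.reset_day)
        if self.current_period is not None and start != self.current_period:
            self.history.append((self.current_period, dict(self.by_agent)))
            self.by_agent.clear()
        self.current_period = start

    def record(self, agent_id: str, task_id: str, cost: Decimal,
               today: date) -> None:
        self._roll_over(today)
        self.by_agent[agent_id] += cost
        self.by_task[task_id] += cost

    def monthly_total(self, today: date) -> Decimal:
        self._roll_over(today)
        return sum(self.by_agent.values(), Decimal("0"))
```

Passing `today` explicitly keeps the reset logic deterministic, which makes the "monthly reset works" test straightforward to write without patching the clock.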

Testing

  • Unit tests for all threshold behaviors (>80% coverage)
  • Test: alert fires at exactly 75%
  • Test: hard stop blocks at exactly 100%
  • Test: auto downgrade triggers correctly at task boundary (not mid-execution)
  • Test: monthly reset works

Dependencies

  • Depends on cost tracking (M2, done)

Design Spec Reference

  • §10.4 — Budget Controls (auto-downgrade boundary clarification)

Updated 2026-03-06: Added auto-downgrade boundary constraint from §10.4 — downgrades at task assignment only, never mid-execution.

Metadata

Assignees

No one assigned

    Labels

    prio:critical: Blocks other work, must do first
    prio:high: Important, should be prioritized
    scope:medium: 1-3 days of work
    spec:budget: DESIGN_SPEC Section 10 - Cost & Budget Management
    type:feature: New feature implementation

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests
