Prometheus metrics: daily budget %, per-agent budget, per-agent task counts #1148

@Aureliolo

Context

PR #1135 implemented the Prometheus /metrics endpoint and OTLP exporter (closing #1122). During the issue-resolution review, three acceptance criteria from #1122 were identified as not yet implementable due to missing infrastructure in CostTracker:

  1. Daily budget utilization % -- #1122 requested "daily %", but CostTracker only exposes get_total_cost() (accumulated lifetime cost). There is no time-windowed query (e.g. get_cost_since(timestamp)) to compute month-to-date or day-to-date spend.

  2. Per-agent budget utilization -- #1122 requested "per-agent" budget metrics, but CostTracker does not expose per-agent cost breakdowns. The budget model (BudgetConfig) defines a global total_monthly limit, not per-agent limits.

  3. Per-agent task counts -- PR #1135 added the agent label to synthorg_tasks_total (using task.assigned_to), but unassigned tasks show as agent="". This works, but the original issue asked for "Task counts (total, by status, by agent)", which implies a richer agent dimension.
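To illustrate the third point, here is a minimal sketch of how counting by status and agent produces an empty agent label for unassigned tasks. The tuple-based task shape and the `count_tasks` helper are hypothetical stand-ins for illustration; the actual collector and task model may look different.

```python
from collections import Counter
from typing import Optional


def count_tasks(tasks: list[tuple[str, Optional[str]]]) -> Counter:
    """Count tasks per (status, agent) label pair.

    Each task is a hypothetical (status, assigned_to) tuple.
    An unassigned task (assigned_to is None) falls back to the
    empty string, which is what surfaces as agent="".
    """
    counts: Counter = Counter()
    for status, assigned_to in tasks:
        counts[(status, assigned_to or "")] += 1
    return counts


sample_tasks = [
    ("done", "agent-1"),
    ("done", None),        # unassigned task
    ("pending", "agent-1"),
    ("done", "agent-1"),
]
task_counts = count_tasks(sample_tasks)
```

All unassigned tasks collapse into the single `agent=""` series, so the agent dimension carries no information for them.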

Solution

Phase 1: CostTracker time-windowed queries

Add get_cost_since(start: datetime) -> float to CostTracker (or the persistence layer). This enables:

  • synthorg_budget_daily_used_percent -- cost since UTC midnight / daily budget
  • More accurate synthorg_budget_used_percent -- cost since month start / monthly budget (currently uses accumulated total)
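A minimal sketch of what Phase 1 could look like, assuming an in-memory list of cost records. `CostRecord`, `CostTrackerSketch`, `utc_midnight`, `month_start`, and `used_percent` are hypothetical names for illustration; the real CostTracker and persistence layer may store and query costs differently. The daily budget is passed in as a plain number because the issue does not say how it is derived.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CostRecord:
    """Hypothetical cost entry; the real record type may differ."""
    cost: float
    timestamp: datetime  # assumed timezone-aware


class CostTrackerSketch:
    """Illustrative in-memory stand-in, not the actual CostTracker."""

    def __init__(self) -> None:
        self._records: list[CostRecord] = []

    def record(self, entry: CostRecord) -> None:
        self._records.append(entry)

    def get_cost_since(self, start: datetime) -> float:
        """Sum the cost of all records at or after `start`."""
        return sum((r.cost for r in self._records if r.timestamp >= start), 0.0)


def utc_midnight(now: datetime) -> datetime:
    """Start of the current UTC day."""
    return now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )


def month_start(now: datetime) -> datetime:
    """Start of the current UTC month."""
    return utc_midnight(now).replace(day=1)


def used_percent(spent: float, limit: float) -> float:
    """Budget utilization in percent; 0.0 when no positive limit is set."""
    return 0.0 if limit <= 0 else spent / limit * 100.0


tracker = CostTrackerSketch()
tracker.record(CostRecord(10.0, datetime(2024, 5, 1, 9, 0, tzinfo=timezone.utc)))
tracker.record(CostRecord(5.0, datetime(2024, 5, 14, 23, 0, tzinfo=timezone.utc)))
tracker.record(CostRecord(2.5, datetime(2024, 5, 15, 1, 0, tzinfo=timezone.utc)))

now = datetime(2024, 5, 15, 12, 0, tzinfo=timezone.utc)
daily_spend = tracker.get_cost_since(utc_midnight(now))
monthly_spend = tracker.get_cost_since(month_start(now))
daily_percent = used_percent(daily_spend, 10.0)  # assumed daily budget of 10.0
```

In a persistence-backed version the filter would presumably become a repository query rather than a scan over an in-memory list.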

Phase 2: Per-agent cost tracking

Add get_cost_by_agent() -> dict[str, float] to CostTracker. This enables:

  • synthorg_agent_cost_total{agent_id} -- per-agent accumulated cost gauge
  • synthorg_agent_budget_used_percent{agent_id} -- per-agent budget utilization (requires per-agent budget limits in BudgetConfig)
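Phase 2 could be sketched along the same lines. The `(agent_id, cost)` record shape, the free function form, and the `limits` mapping are assumptions for illustration; BudgetConfig does not currently define per-agent limits, so the mapping stands in for whatever shape those limits eventually take.

```python
from collections import defaultdict


def get_cost_by_agent(records: list[tuple[str, float]]) -> dict[str, float]:
    """Aggregate accumulated cost per agent.

    `records` is a hypothetical list of (agent_id, cost) pairs.
    """
    totals: defaultdict[str, float] = defaultdict(float)
    for agent_id, cost in records:
        totals[agent_id] += cost
    return dict(totals)


def agent_budget_used_percent(
    costs: dict[str, float], limits: dict[str, float]
) -> dict[str, float]:
    """Per-agent utilization in percent.

    Only agents with a positive configured limit are reported;
    agents with a limit but no recorded cost report 0.0.
    """
    return {
        agent_id: costs.get(agent_id, 0.0) / limit * 100.0
        for agent_id, limit in limits.items()
        if limit > 0
    }


agent_costs = get_cost_by_agent([("a", 1.0), ("b", 2.0), ("a", 3.0)])
agent_percent = agent_budget_used_percent(agent_costs, {"a": 8.0, "c": 4.0})
```

Each entry of `agent_costs` would map to one synthorg_agent_cost_total{agent_id} sample, and each entry of `agent_percent` to one synthorg_agent_budget_used_percent{agent_id} sample.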

Files

  • src/synthorg/budget/cost_tracker.py -- new query methods
  • src/synthorg/persistence/repositories.py -- new repository queries
  • src/synthorg/observability/prometheus_collector.py -- new metric families
  • src/synthorg/budget/config.py -- per-agent budget limits (Phase 2)

Source

Deferred from #1122 issue-resolution review in PR #1135.
