Prometheus /metrics endpoint and OTLP exporter #1122

@Aureliolo

Problem

SynthOrg has no Prometheus /metrics endpoint and no OpenTelemetry (OTLP) exporter. The HTTP batch handler (SinkType.HTTP) ships JSON log records to a URL -- it does not implement the Prometheus exposition format or OTLP gRPC/HTTP.

External monitoring systems (Prometheus/Grafana, Datadog, Honeycomb) need one of:

  • A /metrics endpoint scrapable by Prometheus
  • An OTLP exporter sending traces and metrics to a collector

Without this, SynthOrg cannot participate in standard metrics pipelines or distributed tracing systems. This is the most significant gap for enterprise control-plane positioning.

The 82+ structured event constants and 3-level correlation IDs (request_id, task_id, agent_id) provide the raw material -- the data model is sound, but the export format is not.

Source: docs/research/control-plane-audit.md (G1), closes research #688.

Solution

Phase 1 (higher priority): /metrics Prometheus endpoint

Expose key counters and gauges derived from in-memory state:

  • Task counts (total, by status, by agent)
  • Budget utilization (monthly %, daily %, per-agent)
  • Coordination metrics (efficiency, overhead ratio)
  • Security evaluation counts (by verdict)
  • Agent count (by status, by trust level)

Use prometheus-client or equivalent. Single new route in the API.

Phase 2: OTLP exporter

Add an OtlpSink alongside the existing HttpBatchHandler. It maps structlog events to OTLP spans, using the existing correlation IDs as trace/span context. The endpoint, protocol (gRPC vs HTTP/JSON), and export interval are configurable.
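
One possible way to turn the correlation IDs into trace/span context is sketched below. OpenTelemetry trace IDs are 16 bytes and span IDs are 8 bytes, so arbitrary string IDs need some mapping. This sketch assumes request_id identifies the trace, task_id identifies a span within it, and agent_id travels as a span attribute. That assignment, the hashing scheme, and the helper names (`derive_trace_id`, `derive_span_id`, `to_span_context`) are all assumptions for discussion, not an existing SynthOrg API.

```python
import hashlib


def derive_trace_id(request_id: str) -> str:
    """Derive a deterministic 16-byte trace ID (32 hex chars) from request_id."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    return digest[:16].hex()


def derive_span_id(request_id: str, task_id: str) -> str:
    """Derive a deterministic 8-byte span ID (16 hex chars) for a task.

    The request_id is mixed in so the same task_id under a different
    request does not collide.
    """
    digest = hashlib.sha256(f"{request_id}:{task_id}".encode("utf-8")).digest()
    return digest[:8].hex()


def to_span_context(event: dict) -> dict:
    """Map a structlog-style event dict to hypothetical span fields.

    All-zero IDs are invalid in OpenTelemetry. A truncated SHA-256 digest
    being all zero is vanishingly unlikely and is not handled here.
    """
    request_id = event["request_id"]
    task_id = event["task_id"]
    return {
        "trace_id": derive_trace_id(request_id),
        "span_id": derive_span_id(request_id, task_id),
        "name": event.get("event", "unknown"),
        "attributes": {"agent_id": event.get("agent_id", "")},
    }


# Hypothetical event carrying the three correlation IDs.
event = {
    "event": "task.started",
    "request_id": "req-123",
    "task_id": "task-7",
    "agent_id": "agent-a",
}
span = to_span_context(event)
print(span)
```

Deterministic IDs let every event with the same request_id land in the same trace without shared state between sinks. The trade-off is that span start/end timing and parent links would still need to come from the events themselves.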

Files

  • src/synthorg/observability/ -- new sink or metrics aggregator
  • src/synthorg/api/ -- add /metrics route (unauthenticated, standard scrape target)
  • pyproject.toml -- add prometheus-client (or equivalent) dependency
