Problem
SynthOrg has no Prometheus /metrics endpoint and no OpenTelemetry (OTLP) exporter. The HTTP batch handler (SinkType.HTTP) ships JSON log records to a URL -- it does not implement the Prometheus exposition format or OTLP gRPC/HTTP.
External monitoring systems (Prometheus/Grafana, Datadog, Honeycomb) need one of:
- A /metrics endpoint scrapable by Prometheus
- An OTLP exporter sending traces and metrics to a collector
Without this, SynthOrg cannot participate in standard metrics pipelines or distributed tracing systems. This is the most significant gap for enterprise control-plane positioning.
The 82+ structured event constants and 3-level correlation IDs (request_id, task_id, agent_id) provide the raw material -- the data model is sound, the export format is not.
Source: docs/research/control-plane-audit.md (G1), closes research #688.
Solution
Phase 1 (higher priority): /metrics Prometheus endpoint
Expose key counters and gauges derived from in-memory state:
- Task counts (total, by status, by agent)
- Budget utilization (monthly %, daily %, per-agent)
- Coordination metrics (efficiency, overhead ratio)
- Security evaluation counts (by verdict)
- Agent count (by status, by trust level)
Use prometheus-client or equivalent. Single new route in the API.
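As a rough illustration of Phase 1, the sketch below registers a few of the metrics listed above with prometheus-client and renders them in the text exposition format. The metric names, label names, and the render_metrics helper are hypothetical, not existing SynthOrg code; the real route would return this body from the API's /metrics handler.

```python
from prometheus_client import (
    CONTENT_TYPE_LATEST,
    CollectorRegistry,
    Counter,
    Gauge,
    generate_latest,
)

# A dedicated registry keeps these metrics separate from the library's
# process-wide default registry (and makes the sketch easy to test).
registry = CollectorRegistry()

# Hypothetical metric and label names, chosen for illustration only.
TASKS = Counter(
    "synthorg_tasks_total",
    "Tasks processed, by status and agent.",
    ["status", "agent"],
    registry=registry,
)
BUDGET = Gauge(
    "synthorg_budget_utilization_ratio",
    "Budget used as a fraction of the limit, by window.",
    ["window"],
    registry=registry,
)
AGENTS = Gauge(
    "synthorg_agents",
    "Number of agents, by status and trust level.",
    ["status", "trust_level"],
    registry=registry,
)


def render_metrics() -> tuple[bytes, str]:
    """Return the response body and Content-Type for a /metrics request."""
    return generate_latest(registry), CONTENT_TYPE_LATEST


# Simulate updates that would normally come from in-memory state.
TASKS.labels(status="completed", agent="agent-1").inc()
BUDGET.labels(window="monthly").set(0.42)
AGENTS.labels(status="active", trust_level="standard").set(3)

body, content_type = render_metrics()
print(body.decode())
```

Per-agent labels can grow without bound if agents are short-lived, so it may be worth capping or aggregating them before this goes into a real scrape target.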
Phase 2: OTLP exporter
Add an OtlpSink alongside the existing HttpBatchHandler. It maps structlog events to OTLP spans, using the existing correlation IDs as trace/span context. The endpoint, protocol (gRPC vs HTTP/JSON), and export interval are configurable.
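One open question for Phase 2 is how the correlation IDs become trace context, since OTLP expects a 16-byte trace ID and an 8-byte span ID. The sketch below shows one possible mapping using only the standard library: hash request_id into the trace ID and the full (request_id, task_id, agent_id) triple into the span ID. The function name and the hashing scheme are assumptions made for illustration, not a decided design.

```python
import hashlib


def trace_context(request_id: str, task_id: str, agent_id: str) -> dict[str, str]:
    """Derive hex-encoded OTLP-sized IDs from SynthOrg correlation IDs.

    Hypothetical scheme: all events in one request share a trace ID, and
    each (request, task, agent) combination gets its own span ID.
    """
    # Trace IDs are 16 bytes: take the first 16 bytes of a SHA-256 digest.
    trace_id = hashlib.sha256(request_id.encode("utf-8")).digest()[:16]
    # Span IDs are 8 bytes: hash the full correlation triple.
    span_key = f"{request_id}/{task_id}/{agent_id}".encode("utf-8")
    span_id = hashlib.sha256(span_key).digest()[:8]
    return {"trace_id": trace_id.hex(), "span_id": span_id.hex()}


ctx_a = trace_context("req-1", "task-1", "agent-1")
ctx_b = trace_context("req-1", "task-2", "agent-1")
print(ctx_a)
print(ctx_b)
```

Because the derivation is deterministic, events emitted by different processes for the same request land in the same trace without any extra context propagation. The trade-off is that parent/child span relationships would still need to be modeled separately.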
Files
src/synthorg/observability/ -- new sink or metrics aggregator
src/synthorg/api/ -- add /metrics route (unauthenticated, standard scrape target)
pyproject.toml -- add prometheus-client (or equivalent) dependency