Provider runtime health observability #5451

@kshitijk4poor

Description

Problem

There is no visibility into provider health during a session. The cooldown tracker (if merged) records failures for circuit-breaking, but nothing tracks success and error rates, measures latency, or produces a health summary for diagnostics.

Proposed Solution

Extend the provider tracking to record per-provider stats: success count, error count, average latency, and last error reason. Expose them via the /status slash command and the hermes status CLI. The counters are lightweight and in-memory; no persistence is needed. This pairs with the cooldown tracker as the observability layer.
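
A minimal sketch of what the in-memory counters could look like, to make the proposal concrete. Every name here (`ProviderStats`, `ProviderHealthTracker`, `record_success`, `record_error`, `summary`, and the `provider-a` label) is hypothetical and not taken from the codebase. It also assumes that latency is measured by the caller and that failed calls count toward the average.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ProviderStats:
    """In-memory counters for a single provider (hypothetical structure)."""
    success_count: int = 0
    error_count: int = 0
    total_latency_s: float = 0.0
    last_error_reason: Optional[str] = None

    @property
    def avg_latency_s(self) -> float:
        # Assumption: both successful and failed calls contribute latency.
        calls = self.success_count + self.error_count
        return self.total_latency_s / calls if calls else 0.0

    @property
    def error_rate(self) -> float:
        calls = self.success_count + self.error_count
        return self.error_count / calls if calls else 0.0


class ProviderHealthTracker:
    """Session-scoped tracker; nothing is persisted."""

    def __init__(self) -> None:
        self._stats: Dict[str, ProviderStats] = {}

    def get(self, provider: str) -> ProviderStats:
        # Create the entry lazily the first time a provider is seen.
        return self._stats.setdefault(provider, ProviderStats())

    def record_success(self, provider: str, latency_s: float) -> None:
        stats = self.get(provider)
        stats.success_count += 1
        stats.total_latency_s += latency_s

    def record_error(self, provider: str, latency_s: float, reason: str) -> None:
        stats = self.get(provider)
        stats.error_count += 1
        stats.total_latency_s += latency_s
        stats.last_error_reason = reason

    def summary(self) -> List[str]:
        # One line per provider, suitable for a status-style text output.
        lines = []
        for name, s in sorted(self._stats.items()):
            lines.append(
                f"{name}: ok={s.success_count} err={s.error_count} "
                f"avg_latency={s.avg_latency_s:.2f}s "
                f"last_error={s.last_error_reason or '-'}"
            )
        return lines


if __name__ == "__main__":
    tracker = ProviderHealthTracker()
    tracker.record_success("provider-a", 0.5)
    tracker.record_error("provider-a", 1.0, "HTTP 429")
    print("\n".join(tracker.summary()))
```

In this sketch, the /status slash command and the hermes status CLI would both render the lines returned by `summary()`. How the tracker hooks into the cooldown tracker is left open, since that component may not be merged yet.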

Metadata

Assignees

No one assigned

    Labels

    P3 (Low: cosmetic, nice to have)
    comp/agent (Core agent loop, run_agent.py, prompt builder)
    type/feature (New feature or request)
