Skip to content

companion to devagentic#324: short-circuit retry on cascade_exhausted sentinel + surface trace_id #118

@PowerCreek

Description

@PowerCreek

companion to devagentic#324: short-circuit retry on cascade_exhausted sentinel

Context

devagentic#324 (Phase 1) introduces a server-side recovery cascade that runs INSIDE devagentic when a dispatch returns empty content. The cascade walks alternate models / intents / preamble-strip paths and either recovers, or returns a structured OpenAI-compat error envelope with the sentinel:

{
  "error": {
    "message": "...",
    "code": "devagentic_cascade_exhausted",
    "devagentic": {
      "trace_id": "doc-...",
      "steps_attempted": [...],
      "terminal_outcome": "..."
    }
  }
}

Why hermes-side coordination

Today hermes-cli retries the same dispatch 3x on empty/transient failures. After devagentic#324 lands, those 3 retries happen ON TOP of devagentic's 4-step recovery cascade — worst case 3 × 4 = 12 dispatches per user request, all of which fire R5 + fence + provider calls, blowing through rate budgets (NousResearch#321 cascade compounding).

When devagentic returns error.code == "devagentic_cascade_exhausted", hermes should:

  • Treat the response as terminal (NO retry)
  • Surface the structured error to the user / orchestrator with the trace_id so post-mortem is single-call (dispatchTrace(trace_id) GraphQL query against devagentic)

Scope

Per-provider behavior in the hermes provider plugin layer that handles devagentic-local:

  • Detect body.error.code == "devagentic_cascade_exhausted" on a 200-with-error-body response
  • Skip the retry loop for THIS dispatch
  • Display the error.message + the trace_id in the user-facing surface
  • Telemetry: log + count cascade-exhausted responses so operators can see the rate

Composition

  • devagentic#324 — server-side cascade that emits the sentinel
  • devagentic#321 — 429 retry-cascade tracker; this issue closes the hermes-side half
  • devagentic#314 — broader model-reliability investigation

Sequencing

Land after devagentic#324 Phase 1 deploys + emits its first cascade_exhausted response in production. Without that, this issue has no live producer to consume from. hermes-maintainer can stub against the documented envelope before NousResearch#324 ships to land in parallel.

Owner

hermes-maintainer (per devagentic#203 §1.3 — hermes-side retry semantics + provider plugin layer).

DoD

  • Provider plugin for devagentic-local recognizes the cascade_exhausted sentinel and short-circuits retry
  • User-facing surface shows the structured error with trace_id (clickable / copyable)
  • Test: simulated cascade_exhausted response → no retry attempted, error surfaced cleanly

🤖 Generated with Claude Code

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions