companion to devagentic#324: short-circuit retry on cascade_exhausted sentinel
Context
devagentic#324 (Phase 1) introduces a server-side recovery cascade that runs INSIDE devagentic when a dispatch returns empty content. The cascade walks alternate models / intents / preamble-strip paths and either recovers, or returns a structured OpenAI-compat error envelope with the sentinel:
{
"error": {
"message": "...",
"code": "devagentic_cascade_exhausted",
"devagentic": {
"trace_id": "doc-...",
"steps_attempted": [...],
"terminal_outcome": "..."
}
}
}
Why hermes-side coordination
Today hermes-cli retries the same dispatch 3x on empty/transient failures. After devagentic#324 lands, those 3 retries happen ON TOP of devagentic's 4-step recovery cascade — worst case 3 × 4 = 12 dispatches per user request, all of which fire R5 + fence + provider calls, blowing through rate budgets (NousResearch#321 cascade compounding).
When devagentic returns error.code == "devagentic_cascade_exhausted", hermes should:
- Treat the response as terminal (NO retry)
- Surface the structured error to the user / orchestrator with the trace_id so post-mortem is single-call (
dispatchTrace(trace_id) GraphQL query against devagentic)
Scope
Per-provider behavior in the hermes provider plugin layer that handles devagentic-local:
- Detect
body.error.code == "devagentic_cascade_exhausted" on a 200-with-error-body response
- Skip the retry loop for THIS dispatch
- Display the
error.message + the trace_id in the user-facing surface
- Telemetry: log + count cascade-exhausted responses so operators can see the rate
Composition
- devagentic#324 — server-side cascade that emits the sentinel
- devagentic#321 — 429 retry-cascade tracker; this issue closes the hermes-side half
- devagentic#314 — broader model-reliability investigation
Sequencing
Land after devagentic#324 Phase 1 deploys + emits its first cascade_exhausted response in production. Without that, this issue has no live producer to consume from. hermes-maintainer can stub against the documented envelope before NousResearch#324 ships to land in parallel.
Owner
hermes-maintainer (per devagentic#203 §1.3 — hermes-side retry semantics + provider plugin layer).
DoD
- Provider plugin for
devagentic-local recognizes the cascade_exhausted sentinel and short-circuits retry
- User-facing surface shows the structured error with trace_id (clickable / copyable)
- Test: simulated
cascade_exhausted response → no retry attempted, error surfaced cleanly
🤖 Generated with Claude Code
companion to devagentic#324: short-circuit retry on
cascade_exhaustedsentinelContext
devagentic#324 (Phase 1) introduces a server-side recovery cascade that runs INSIDE devagentic when a dispatch returns empty content. The cascade walks alternate models / intents / preamble-strip paths and either recovers, or returns a structured OpenAI-compat error envelope with the sentinel:
{ "error": { "message": "...", "code": "devagentic_cascade_exhausted", "devagentic": { "trace_id": "doc-...", "steps_attempted": [...], "terminal_outcome": "..." } } }Why hermes-side coordination
Today hermes-cli retries the same dispatch 3x on empty/transient failures. After devagentic#324 lands, those 3 retries happen ON TOP of devagentic's 4-step recovery cascade — worst case 3 × 4 = 12 dispatches per user request, all of which fire R5 + fence + provider calls, blowing through rate budgets (NousResearch#321 cascade compounding).
When devagentic returns
error.code == "devagentic_cascade_exhausted", hermes should:dispatchTrace(trace_id)GraphQL query against devagentic)Scope
Per-provider behavior in the hermes provider plugin layer that handles devagentic-local:
body.error.code == "devagentic_cascade_exhausted"on a 200-with-error-body responseerror.message+ the trace_id in the user-facing surfaceComposition
Sequencing
Land after devagentic#324 Phase 1 deploys + emits its first
cascade_exhaustedresponse in production. Without that, this issue has no live producer to consume from. hermes-maintainer can stub against the documented envelope before NousResearch#324 ships to land in parallel.Owner
hermes-maintainer (per devagentic#203 §1.3 — hermes-side retry semantics + provider plugin layer).
DoD
devagentic-localrecognizes thecascade_exhaustedsentinel and short-circuits retrycascade_exhaustedresponse → no retry attempted, error surfaced cleanly🤖 Generated with Claude Code