Summary
When an OpenAI-Codex OAuth credential is revoked or invalidated upstream, Hermes marks it exhausted with a 1-hour TTL cooldown. After the TTL expires, the broken credential re-enters the rotation pool and fails again — usually on context compression in long sessions, where the failure surfaces as Failed to generate context summary and breaks the session's compaction.
Cooldown semantics are correct for transient errors (429, 5xx, quota throttles). They are wrong for permanent OAuth states like token_invalidated and token_revoked.
Repro
Today (2026-05-26) on a fresh install, observed 7 separate 401 token_invalidated failures from the same revoked Codex OAuth credential between 10:27 and 17:50 UTC:
Failed to generate context summary:
Error code: 401 - {'error': {'message': 'Your authentication token has been invalidated. Please try signing in again.',
'type': 'invalid_request_error',
'code': 'token_invalidated',
'param': None},
'status': 401}
Removing the credential manually via hermes auth → option 2 → remove openai-codex #1 silenced the failures temporarily, but a fresh credential under the same label, Hermes Agent Codex, re-appeared in the pool later. This may come from a separate re-auth flow, or from the cooldown re-rotating a stale auth.json entry; it needs upstream confirmation.
Root cause
In hermes-agent credential pool logic:
EXHAUSTED_TTL_DEFAULT_SECONDS = 60 * 60
A 401 token_invalidated from the model provider takes the same exhausted code path as a 429 rate_limit — both get a 1-hour TTL, after which the credential is re-considered eligible. This means a permanently-revoked OAuth token will keep getting picked back up.
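The effect of a TTL-only cooldown can be illustrated with a minimal sketch. Only EXHAUSTED_TTL_DEFAULT_SECONDS comes from the code quoted above; the Credential class, mark_exhausted and is_eligible are hypothetical names for illustration, not the actual hermes-agent implementation.

```python
from dataclasses import dataclass
from typing import Optional

EXHAUSTED_TTL_DEFAULT_SECONDS = 60 * 60  # value quoted in this report


@dataclass
class Credential:
    """Hypothetical stand-in for a pool entry."""
    label: str
    exhausted_at: Optional[float] = None   # timestamp of the last failure
    last_error_reason: Optional[str] = None


def mark_exhausted(cred: Credential, reason: str, now: float) -> None:
    # Assumed behavior: every failure reason lands on the same path.
    cred.exhausted_at = now
    cred.last_error_reason = reason


def is_eligible(cred: Credential, now: float) -> bool:
    # Eligibility depends only on elapsed time, never on the failure reason,
    # so a revoked token becomes eligible again once the TTL has passed.
    if cred.exhausted_at is None:
        return True
    return now - cred.exhausted_at >= EXHAUSTED_TTL_DEFAULT_SECONDS
```

Under these assumptions, a credential that failed with token_invalidated at t=0 is skipped at t=1800 and picked again at t=3600, at which point it fails with the same 401.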
Expected behavior
token_invalidated and token_revoked should transition the credential to a terminal dead state that never re-enters rotation until the credential is explicitly re-added or refreshed by the operator. Other 401 codes (e.g. token_expired, if Hermes can refresh the token) should keep cooldown semantics, but _invalidated and _revoked credentials cannot be auto-recovered.
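The split between permanent and transient failures could be expressed as a small classifier. This is a sketch under assumptions: classify_failure and PERMANENT_OAUTH_CODES are hypothetical names, and the only error codes taken from this report are token_invalidated, token_revoked, token_expired and rate_limit.

```python
from typing import Optional

# Codes this report treats as unrecoverable without operator action.
PERMANENT_OAUTH_CODES = frozenset({"token_invalidated", "token_revoked"})


def classify_failure(status: int, code: Optional[str]) -> str:
    """Return the pool state a failed request should put the credential in."""
    if status == 401 and code in PERMANENT_OAUTH_CODES:
        return "dead"       # terminal: no TTL, no automatic retry
    return "exhausted"      # transient: keep the existing cooldown path


print(classify_failure(401, "token_invalidated"))
print(classify_failure(401, "token_expired"))
print(classify_failure(429, "rate_limit"))
```

Keying on the provider's error code rather than on the bare 401 status keeps refreshable failures such as token_expired on the cooldown path.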
Suggested fix
Extend the credential pool state machine to include a dead state alongside exhausted:
- 429 / 503 / network errors → exhausted with TTL cooldown (current behavior, correct)
- 401 with code == 'token_invalidated' or code == 'token_revoked' → dead, no auto-recovery
- Successful OAuth refresh on a dead credential → transition back to ok
- dead credentials excluded from pick_for_provider() unconditionally
- hermes auth UI surfaces dead separately so the operator can see why it's offline
This mirrors the pattern already implemented in some open-source job-queue libraries (e.g., Sidekiq's dead set vs. retry set).
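The transitions listed above can be sketched as a three-state pool. This is an illustration, not the hermes-agent code: pick_for_provider() is named in this report but its real signature is not known here, and State, Credential, CredentialPool, record_failure and record_refresh_success are hypothetical.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional

EXHAUSTED_TTL_DEFAULT_SECONDS = 60 * 60
PERMANENT_OAUTH_CODES = frozenset({"token_invalidated", "token_revoked"})


class State(Enum):
    OK = "ok"
    EXHAUSTED = "exhausted"   # transient, recovers after the TTL
    DEAD = "dead"             # terminal until refreshed or re-added


@dataclass
class Credential:
    provider: str
    label: str
    state: State = State.OK
    exhausted_until: float = 0.0
    last_error_reason: Optional[str] = None


class CredentialPool:
    def __init__(self, credentials: Iterable[Credential]) -> None:
        self.credentials = list(credentials)

    def record_failure(self, cred, status, code, now=None) -> None:
        now = time.time() if now is None else now
        cred.last_error_reason = code
        if status == 401 and code in PERMANENT_OAUTH_CODES:
            cred.state = State.DEAD          # no TTL is set
        else:
            cred.state = State.EXHAUSTED
            cred.exhausted_until = now + EXHAUSTED_TTL_DEFAULT_SECONDS

    def record_refresh_success(self, cred) -> None:
        # Only an explicit, successful OAuth refresh revives a dead entry.
        cred.state = State.OK
        cred.last_error_reason = None

    def pick_for_provider(self, provider, now=None) -> Optional[Credential]:
        now = time.time() if now is None else now
        for cred in self.credentials:
            if cred.provider != provider or cred.state is State.DEAD:
                continue                     # dead is skipped unconditionally
            if cred.state is State.EXHAUSTED:
                if now < cred.exhausted_until:
                    continue                 # still cooling down
                cred.state = State.OK        # TTL elapsed, eligible again
            return cred
        return None
```

Keeping last_error_reason on the entry gives the hermes auth UI something to display for a dead credential, which covers the last bullet.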
Workaround for affected users
Manual: hermes auth → option 2 → remove the openai-codex credential that shows last_status: exhausted and last_error_reason: token_invalidated. Do NOT switch auxiliary.compression.provider to a different model provider as a workaround if that other provider is per-token-billed (e.g., Anthropic API); at agentic-run query volume the bill gets expensive fast.
Environment
- Hermes config: model.provider = openai-codex, model.default = gpt-5.5, auxiliary.compression.provider = auto
- Pool: 3 openai-codex credentials, 1 revoked, 2 healthy
- Failure surfaces in ~/.hermes/logs/errors.log as Failed to generate context summary: 401 token_invalidated