
[Bug][auth] Refresh token rotation not persisted — RT reuse triggers session revocation across all profiles (v0.11.0) #15099

@camelludo

Bug Description

On Hermes Agent v0.11.0 (SHA acdcb167), the Nous Portal refresh-token chain dies within 4–10 days on every profile, and the failure mode indicates the client is not persisting the rotated refresh_token. Nous's OAuth server treats reuse of a previously-rotated RT as a token-theft signal and revokes the entire session chain.

Today I diagnosed this incident on 4 profiles simultaneously (Pedro / Selim / Omar / Atlas) — 7 credentials total across their pools, all failing. Below is the reproduction evidence.

Environment

  • Hermes Agent v0.11.0 (2026.4.23), SHA acdcb167 (0 commits behind origin/main as of this writing)
  • Ubuntu 22.04, Python 3.11.15, running as 4-profile setup (root + 3 sub-profiles under /root/.hermes/profiles/*)
  • Account sub: cmnit8tqn000cl704ac8x2jn8 (Scale tier, $50/mo, subscription_tier: 3)
  • OAuth flow used: hermes auth add nous --type oauth --no-browser (device-code variant)
  • All 4 profiles re-OAuthed across 2026-04-13 → 2026-04-15; all 7 pool entries dead by 2026-04-24
  • Distinct from [Bug]: Paid Scale-tier subscriber — tool_gateway_admin: false, every Tool Gateway call rejected with AUTH_ERROR #14435 (that's Tool Gateway server-side provisioning; this is client-side RT rotation)

7-row failure matrix (from live POST /api/oauth/token with grant_type=refresh_token)

| profile | pool # | cred age (h) | server response |
|---|---|---|---|
| pedro | 0 | 116.8 | invalid_grant: Refresh session has been revoked |
| selim | 0 | 99.0 | invalid_grant: Refresh token reuse detected; please re-authenticate |
| selim | 1 | 3.0 | invalid_grant: Refresh session has been revoked |
| omar | 0 | 116.8 | invalid_grant: Refresh session has been revoked |
| omar | 1 | 99.0 | invalid_grant: Refresh session has been revoked |
| atlas | 0 | 116.8 | invalid_grant: Refresh session has been revoked |
| atlas | 1 | 99.0 | invalid_grant: Refresh session has been revoked |

The Selim #0 response — "Refresh token reuse detected" — is the smoking gun. Nous's server is telling us the client tried to use an RT that had already been rotated and retired. The other 6 rows are the consequence: after the server detected RT reuse, the entire session chain was revoked, cascading through every subsequent refresh attempt.
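To make the root-cause versus cascade split explicit, here is a small triage sketch over the seven rows above. The function name and the category labels are my own, and the matching is done on the error strings exactly as the server returned them in this incident; it is not anything Hermes ships.

```python
from collections import Counter

# (profile, pool index, server error description) copied from the matrix above.
MATRIX = [
    ("pedro", 0, "Refresh session has been revoked"),
    ("selim", 0, "Refresh token reuse detected; please re-authenticate"),
    ("selim", 1, "Refresh session has been revoked"),
    ("omar", 0, "Refresh session has been revoked"),
    ("omar", 1, "Refresh session has been revoked"),
    ("atlas", 0, "Refresh session has been revoked"),
    ("atlas", 1, "Refresh session has been revoked"),
]


def classify_refresh_failure(description: str) -> str:
    """Hypothetical triage: separate the reuse signal from its consequences."""
    text = description.lower()
    if "reuse detected" in text:
        return "root_cause"  # the server saw an already-rotated RT
    if "revoked" in text:
        return "cascade"     # the session chain was revoked afterwards
    return "other"


tally = Counter(classify_refresh_failure(desc) for _, _, desc in MATRIX)
print(dict(tally))
```

Only one row carries the reuse signal; the other six are downstream revocations.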

Nous appears to implement refresh token rotation as described in the OAuth 2.1 draft: each successful refresh issues a new refresh token and retires the previous one. RFC 6749 (section 6) states the client's side of this: when the authorization server issues a new refresh token, "the client MUST discard the old refresh token and replace it with the new refresh token." For clients to coexist with rotation, they must persist the newly issued RT from each refresh response and discard the old one.
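A minimal sketch of that client-side rule, assuming pool entries and token responses are plain dicts with access_token and refresh_token keys. The function name apply_refresh_response is hypothetical and is not taken from the Hermes codebase.

```python
from typing import Any, Dict


def apply_refresh_response(entry: Dict[str, Any], response: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a pool entry updated from a token-endpoint response."""
    updated = dict(entry)
    updated["access_token"] = response["access_token"]

    # Under rotation the response carries a new refresh_token and the old one
    # is retired server-side, so the stored value must be replaced.
    new_rt = response.get("refresh_token")
    if new_rt:
        updated["refresh_token"] = new_rt
    # If the server did not rotate (no refresh_token in the response),
    # the existing refresh_token stays valid and is kept as-is.
    return updated
```

The bug suspected in this report would correspond to a version of this function that copies access_token only.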

Why the v0.10.0 "real fix" commits did not prevent this

The v0.9.0 → v0.10.0 release notes and my upgrade tracking pointed to three commits as the auth persistence fix:

These address pool dedup and provider-pointer mirroring. None of them touches the code path that takes the refresh response and writes the new refresh_token back to auth.json. If that code path silently drops the new RT, every subsequent refresh sends the now-retired old RT, which Nous treats as reuse.

Request: can someone on the Hermes team confirm whether _refresh_nous_access_token (or equivalent) is persisting the refresh_token field from the response JSON back into credential_pool.nous[active].refresh_token? If it's only persisting access_token, that's the bug.
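For reference, this is roughly what I would expect the write-back to look like. It is a sketch under assumptions: the auth.json layout (credential_pool → nous → list of entries) is inferred from the field path above, and persist_rotated_tokens is a hypothetical helper, not the actual Hermes function.

```python
import json
import os
import tempfile
from pathlib import Path


def persist_rotated_tokens(auth_path, active_index, response):
    """Write both tokens from a refresh response back to auth.json atomically."""
    path = Path(auth_path)
    data = json.loads(path.read_text())

    # Assumed layout: {"credential_pool": {"nous": [ {...}, {...} ]}}
    entry = data["credential_pool"]["nous"][active_index]
    entry["access_token"] = response["access_token"]
    if response.get("refresh_token"):
        # The step this issue is about: keep the rotated RT, drop the old one.
        entry["refresh_token"] = response["refresh_token"]

    # Write to a temp file in the same directory, then rename over the original,
    # so a crash mid-write cannot leave a truncated auth.json behind.
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(data, fh, indent=2)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return entry
```

The atomic rename matters here because a rotated RT that is received but lost during a partial write has the same outcome as never persisting it.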

Secondary issue discovered in the same diagnosis

hermes auth add nous --type oauth --no-browser on v0.11.0 writes pool entries without obtained_at or agent_key_obtained_at populated (fields show None in the saved JSON). Other fields like expires_at, access_token, refresh_token, label, agent_key are all populated correctly. This is a regression compared to older credentials in the same pool — which have obtained_at set to a real ISO 8601 timestamp.

Downstream impact: any tool that sorts/prunes pool entries using obtained_at as the freshness signal will silently treat fresh credentials as "oldest" and may evict them. I hit this myself in a custom self-heal hook that uses obtained_at to pick the freshest pool entry — fresh credentials were being auto-pruned on every gateway restart.

Suggested fix: ensure the code path that constructs the pool entry in auth_add_command / _nous_device_code_login populates obtained_at and agent_key_obtained_at with the current UTC timestamp when the mint completes.
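Something along these lines is what I have in mind. The helper name stamp_mint_times is made up for illustration, and I am assuming the timestamps are stored as ISO 8601 strings in UTC, as the older entries in my pool are.

```python
from datetime import datetime, timezone


def stamp_mint_times(entry, now=None):
    """Fill obtained_at / agent_key_obtained_at if the mint left them empty."""
    now = now or datetime.now(timezone.utc)
    timestamp = now.isoformat()

    stamped = dict(entry)
    for field in ("obtained_at", "agent_key_obtained_at"):
        # Only fill missing values; never overwrite a real timestamp.
        if not stamped.get(field):
            stamped[field] = timestamp
    return stamped
```

Called once when the device-code login completes, this would keep new entries sortable alongside older ones.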

What we did to unblock ourselves

  1. Re-OAuthed all 4 profiles via hermes auth add nous --type oauth --no-browser
  2. Patched our custom self-heal hook to fall back to expires_at when obtained_at is missing
  3. Restarted all 4 gateways; verified /v1/chat/completions returns HTTP 200 and /api/oauth/token mints fresh tokens successfully
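The fallback in step 2 looks roughly like the sketch below. It is a simplified version of our own hook, not Hermes code, and it assumes both obtained_at and expires_at are ISO 8601 strings when present.

```python
from datetime import datetime, timezone

EPOCH_MIN = datetime.min.replace(tzinfo=timezone.utc)


def _parse_ts(value):
    """Parse an ISO 8601 string into an aware datetime, or return None."""
    if not value:
        return None
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def freshness_key(entry):
    # Prefer obtained_at; fall back to expires_at when the mint left it empty.
    return (
        _parse_ts(entry.get("obtained_at"))
        or _parse_ts(entry.get("expires_at"))
        or EPOCH_MIN
    )


def pick_freshest(pool):
    """Return the pool entry with the most recent freshness timestamp."""
    return max(pool, key=freshness_key)
```

Mixing obtained_at and expires_at in one ordering is only approximate, since expires_at runs ahead by the token lifetime, but it stops fresh entries with a missing obtained_at from sorting as the oldest.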

Working state confirmed as of 2026-04-24.

Reproducer steps

  1. Start with a clean Hermes v0.11.0 setup, any subscription tier
  2. Run hermes auth add nous --type oauth on any profile, approve via browser
  3. Leave the gateway running for 4–10 days with moderate traffic (any channel that triggers _refresh_nous_access_token a few times)
  4. Eventually, POST /api/oauth/token with grant_type=refresh_token returns invalid_grant: Refresh token reuse detected, typically around the 2nd or 3rd refresh
  5. All subsequent calls cascade to Refresh session has been revoked
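The 4–10 day wait can be avoided when reasoning about the failure. Below is a toy in-memory model of a rotating authorization server with reuse detection, plus a client that either persists the rotated RT or drops it. This is my own simplification for illustration; it is not Nous's server logic, and the class and function names are invented.

```python
import secrets


class RotatingAuthServer:
    """Toy model: rotate the RT on every refresh, revoke the session on reuse."""

    def __init__(self):
        self.active = {}      # current refresh token -> session id
        self.retired = {}     # rotated-out refresh token -> session id
        self.revoked = set()  # session ids revoked after reuse detection

    def issue(self, session_id):
        token = secrets.token_hex(8)
        self.active[token] = session_id
        return token

    def _error(self, description):
        return {"error": "invalid_grant", "error_description": description}

    def refresh(self, token):
        if token in self.retired:
            session = self.retired[token]
            if session in self.revoked:
                return self._error("Refresh session has been revoked")
            self.revoked.add(session)  # first reuse: revoke the whole chain
            return self._error("Refresh token reuse detected; please re-authenticate")
        session = self.active.get(token)
        if session is None or session in self.revoked:
            return self._error("Refresh session has been revoked")
        # Normal path: retire the presented token and hand out a new one.
        del self.active[token]
        self.retired[token] = session
        return {"access_token": secrets.token_hex(8), "refresh_token": self.issue(session)}


def run_client(server, persist_rotation, refreshes=3):
    """Refresh repeatedly; optionally store the rotated RT from each response."""
    stored = server.issue("session-1")
    outcomes = []
    for _ in range(refreshes):
        response = server.refresh(stored)
        outcomes.append(response.get("error_description", "ok"))
        if persist_rotation and "refresh_token" in response:
            stored = response["refresh_token"]
    return outcomes


print(run_client(RotatingAuthServer(), persist_rotation=False))
print(run_client(RotatingAuthServer(), persist_rotation=True))
```

In this model, a client that drops the rotated RT succeeds once, gets the reuse error on its second refresh, and sees the revoked error on every refresh after that, which is the same sequence as steps 4 and 5.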

Willing to help

Happy to share full JWT payloads, OAuth response bodies, or patch a candidate fix against hermes_cli/auth.py if someone on the Nous team can point me at the right function. The fact that this hit 4 independent profiles within 10 days on the same account suggests a deterministic bug, not a flaky edge case.

Tagging for visibility: @someone-on-auth (please re-tag the right maintainer).

Labels: P1 (High: major feature broken, no workaround); area/auth (Authentication, OAuth, credential pools); comp/cli (CLI entry point, hermes_cli/, setup wizard); type/bug (Something isn't working)