openai-codex credential pool: _sync_codex_entry_from_cli overwrites all entries with the same token #11364

@Nicolas-Formenton

Issue: openai-codex credential pool — _sync_codex_entry_from_cli overwrites all entries with the same token

Summary

When the openai-codex credential pool contains multiple manually-added OAuth accounts, _sync_codex_entry_from_cli() reads tokens from ~/.codex/auth.json (a single-token file) and propagates that one token to all pool entries. This causes the pool to become a set of identical copies of the same account, making credential rotation completely ineffective.

Environment

  • Hermes Agent: current main branch
  • Provider: openai-codex
  • Pool strategy: round_robin
  • 5 manually-added OAuth credentials (device_code flow)
  • CLI mode active during the incident

Steps to Reproduce

  1. Add multiple openai-codex OAuth credentials to the pool via hermes auth add openai-codex (each with a different ChatGPT account via device_code flow)
  2. Verify that credential_pool in ~/.hermes/auth.json has N entries, each with a different access_token and refresh_token
  3. Have one account hit a 429 (rate limit / usage limit)
  4. Observe that _sync_codex_entry_from_cli is triggered during _available_entries() for exhausted entries
  5. Check ~/.hermes/auth.json — all N entries now have the same access_token and refresh_token

Root Cause

File: agent/credential_pool.py

_sync_codex_entry_from_cli (line ~461):

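def _sync_codex_entry_from_cli(self, entry: PooledCredential) -> PooledCredential:
    cli_tokens = _import_codex_cli_tokens()  # reads ~/.codex/auth.json
    if not cli_tokens:
        return entry
    # Token extraction is abridged in this excerpt; the key names below are
    # assumed to mirror the entry's field names.
    cli_access = cli_tokens.get("access_token", "")
    cli_refresh = cli_tokens.get("refresh_token", "")
    if cli_refresh and cli_refresh != entry.refresh_token:
        updated = replace(entry,
            access_token=cli_access,   # <-- overwrites with singleton token
            refresh_token=cli_refresh,  # <-- overwrites with singleton token
        )
        self._replace_entry(entry, updated)
        self._persist()  # <-- writes ALL entries back to ~/.hermes/auth.json
        return updated
    return entry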

This is called from _available_entries() (line ~796):

if (self.provider == "openai-codex"
        and entry.last_status == STATUS_EXHAUSTED
        and entry.refresh_token):
    synced = self._sync_codex_entry_from_cli(entry)

The problem: ~/.codex/auth.json is a single-token file maintained by the Codex CLI. It can only hold ONE account's tokens at a time. When _sync_codex_entry_from_cli runs, it overwrites the pool entry's tokens with whatever is in that file — regardless of which account the entry originally represented.

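The sync runs for every exhausted entry during _available_entries(), and nothing checks that the entry belongs to the account currently stored in ~/.codex/auth.json. After the first sync, _persist() writes the overwritten entry to ~/.hermes/auth.json, so subsequent load_pool() calls read the corrupted state. Each further entry that becomes exhausted is then synced the same way, until every entry carries the same token.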

Cascading Effect

  1. Account A hits 429 → marked exhausted
  2. _available_entries() runs → calls _sync_codex_entry_from_cli for account A
  3. ~/.codex/auth.json has tokens from the LAST Codex CLI login (say, account B)
  4. Account A's tokens are overwritten with account B's tokens
  5. _persist() writes the corrupted pool to ~/.hermes/auth.json
  6. Next session loads pool → all entries that were synced now have account B's token
  7. When account B hits 429, ALL synced entries also hit 429 (they're the same account)
  8. Pool rotation appears to "work" (rotates to next entry) but uses the same token
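
The cascade above can be reproduced in isolation with a toy model. This is a minimal sketch, not the Hermes code: the Entry dataclass, sync_from_cli, available_entries and simulate are hypothetical stand-ins that only mimic the behavior described in this report (adopt the CLI tokens whenever the refresh tokens differ, for every exhausted entry).

```python
from dataclasses import dataclass, replace

STATUS_OK = "ok"
STATUS_EXHAUSTED = "exhausted"


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for a pooled credential."""
    label: str
    access_token: str
    refresh_token: str
    last_status: str = STATUS_OK


def sync_from_cli(entry, cli_tokens):
    """Mimic the reported sync: adopt the CLI tokens whenever they differ."""
    cli_refresh = cli_tokens.get("refresh_token", "")
    if cli_refresh and cli_refresh != entry.refresh_token:
        return replace(
            entry,
            access_token=cli_tokens.get("access_token", ""),
            refresh_token=cli_refresh,
        )
    return entry


def available_entries(pool, cli_tokens):
    """Sync every exhausted entry, as the unguarded call site does."""
    return [
        sync_from_cli(e, cli_tokens)
        if e.last_status == STATUS_EXHAUSTED and e.refresh_token
        else e
        for e in pool
    ]


def simulate():
    """Five distinct accounts; the CLI file holds account B's tokens."""
    cli_tokens = {"access_token": "access-B", "refresh_token": "refresh-B"}
    pool = [
        Entry(f"Account{i}", f"access-{c}", f"refresh-{c}")
        for i, c in enumerate("ABCDE", start=1)
    ]
    # Each account eventually hits a 429 and is marked exhausted.
    for i in range(len(pool)):
        pool[i] = replace(pool[i], last_status=STATUS_EXHAUSTED)
        pool = available_entries(pool, cli_tokens)
    return pool
```

In this toy run, every entry returned by simulate() ends up holding access-B and refresh-B while keeping its original label, which matches the state shown under Evidence.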

Evidence

All 5 pool entries had identical tokens:

Entry[0] (label=Account1): access_token hash = abc123...
Entry[1] (label=Account2): access_token hash = abc123...  (same)
Entry[2] (label=Account3): access_token hash = abc123...  (same)
Entry[3] (label=Account4): access_token hash = abc123...  (same)
Entry[4] (label=Account5): access_token hash = abc123...  (same)

~/.codex/auth.json access_token hash = abc123... (same)

All 5 entries shared the same refresh_token (including the 4 "manual:device_code" entries that should have their own unique tokens).
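
A check like the one above can be scripted without printing any secrets. The sketch below is illustrative only: token_fingerprint and find_shared_tokens are hypothetical helpers, and they take a plain list of dicts with label, access_token and refresh_token keys. How those dicts are extracted from ~/.hermes/auth.json depends on the file layout, which is not specified here.

```python
import hashlib
from collections import defaultdict


def token_fingerprint(token: str) -> str:
    """Return a short SHA-256 prefix so tokens can be compared without being shown."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()[:12]


def find_shared_tokens(entries, field="access_token"):
    """Group entry labels by token fingerprint and keep groups with more than one label."""
    groups = defaultdict(list)
    for index, entry in enumerate(entries):
        token = entry.get(field) or ""
        if token:
            label = entry.get("label", f"entry[{index}]")
            groups[token_fingerprint(token)].append(label)
    return {fp: labels for fp, labels in groups.items() if len(labels) > 1}
```

An empty result means every entry holds a distinct token for that field. Any non-empty result lists the labels that have collapsed onto the same credential. Running it once with field="access_token" and once with field="refresh_token" covers both symptoms.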

Proposed Fix

Option A (minimal): Only sync the entry whose source == "device_code" (the singleton entry seeded from ~/.codex/auth.json). Never sync manual entries.

if (self.provider == "openai-codex"
        and entry.last_status == STATUS_EXHAUSTED
        and entry.refresh_token
        and entry.source == "device_code"):  # only the singleton
    synced = self._sync_codex_entry_from_cli(entry)

Option B (robust): In _sync_codex_entry_from_cli, only proceed if the entry's current refresh_token matches what ~/.codex/auth.json previously had (i.e., this entry IS the one that was last used by the CLI). If the tokens are completely different accounts, skip the sync.

def _sync_codex_entry_from_cli(self, entry):
    cli_tokens = _import_codex_cli_tokens()
    if not cli_tokens:
        return entry
    cli_refresh = cli_tokens.get("refresh_token", "")
    # Only sync if this entry's refresh_token is a PREVIOUS version
    # of the CLI token (same account, just refreshed by CLI).
    # If completely different, it's a different account — don't touch it.
    if cli_refresh == entry.refresh_token:
        return entry  # already in sync
    # Check if it's the same account (access token shares same subject/JWT sub)
    # If not, skip — this entry represents a different account.
    ...
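
One way to fill in the same-account check, as a minimal sketch: decode the payload of both access tokens without verifying the signature, then compare an identity claim. jwt_claim and same_account are hypothetical helpers. The sketch assumes that the access tokens are JWTs and that "sub" identifies the account; neither is confirmed here. If either token cannot be decoded, the check treats the accounts as different, so the sync is skipped.

```python
import base64
import json
from typing import Optional


def jwt_claim(token: str, claim: str = "sub") -> Optional[str]:
    """Read one string claim from a JWT payload without verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        return None  # not a JWT-shaped token
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    try:
        data = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:  # covers bad base64, bad UTF-8, and bad JSON
        return None
    value = data.get(claim) if isinstance(data, dict) else None
    return value if isinstance(value, str) else None


def same_account(entry_access_token: str, cli_access_token: str,
                 claim: str = "sub") -> bool:
    """True only when both tokens decode and carry the same claim value."""
    entry_id = jwt_claim(entry_access_token, claim)
    cli_id = jwt_claim(cli_access_token, claim)
    return entry_id is not None and entry_id == cli_id
```

Failing closed is deliberate: leaving a stale token in place costs one refresh, whereas adopting another account's token silently removes an account from rotation. The decoded claim is used only for comparison and must not be treated as authenticated, since the signature is never checked.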

Option C (nuclear): Remove _sync_codex_entry_from_cli entirely and let the pool's own _refresh_entry handle token rotation. The sync was added as a convenience but introduces data corruption for multi-account setups.

Additional Context

  • The has_retried_429 flow (first 429 doesn't rotate, second does) works correctly — the rotation itself isn't the issue
  • The round_robin strategy works correctly — the issue is that all entries are identical after sync
  • _seed_from_singletons also seeds from providers.openai-codex.tokens (the same singleton), but this is expected behavior for the "device_code" entry — the problem is only with the sync overwriting manual entries

Impact

Any user with multiple openai-codex OAuth accounts in the credential pool will silently lose multi-account rotation after the first token refresh or 429 event. The pool appears to work (logs show rotation) but all entries use the same token, so every account hits the same rate limit.

Metadata

    Labels

    • P1: High (major feature broken, no workaround)
    • area/auth: Authentication, OAuth, credential pools
    • comp/agent: Core agent loop, run_agent.py, prompt builder
    • provider/openai: OpenAI / Codex Responses API
    • type/bug: Something isn't working
