Summary
agent/credential_pool.py uses a threading.Lock for synchronization, but multiple processes (CLI, gateway, delegate agents) can read/write the same pool JSON file concurrently. The lock is process-local only — there is no cross-process coordination.
Root Cause
_available_entries() (lines ~746-800) calls _sync_anthropic_entry_from_credentials_file() and _sync_codex_entry_from_cli(), each of which calls _replace_entry() and _persist() internally. A single _available_entries() call can therefore trigger several _persist() calls, each of which writes the full pool to disk.
If a second process reads the pool file between two _persist() calls, it sees a partially-synced state where some entries have been updated and others haven't.
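The window described above can be reproduced deterministically in a single process. The following is a minimal sketch, not the actual code in agent/credential_pool.py: the flat provider-to-token pool layout and the helper names persist() and read_pool() are assumptions made for illustration only.

```python
import json
import os
import tempfile


def persist(path, pool):
    # Hypothetical stand-in for _persist(): rewrites the whole pool file.
    with open(path, "w") as f:
        json.dump(pool, f)


def read_pool(path):
    # Stand-in for what another process would do when loading the pool.
    with open(path) as f:
        return json.load(f)


pool_path = os.path.join(tempfile.mkdtemp(), "pool.json")

# Assumed layout for illustration; the real pool file format may differ.
pool = {"anthropic": "old-token", "codex": "old-token"}
persist(pool_path, pool)

# First sync step updates one entry and persists immediately.
pool["anthropic"] = "new-token"
persist(pool_path, pool)

# A second process reading at this point sees a mix of new and old entries.
seen_by_other_process = read_pool(pool_path)

# Second sync step updates the other entry and persists again.
pool["codex"] = "new-token"
persist(pool_path, pool)

final_state = read_pool(pool_path)
print(seen_by_other_process)
print(final_state)
```

In a real multi-process setup the read would land in this window only occasionally, which is why the issue is hard to observe in single-process CLI usage.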
Impact
- Under high concurrency (multiple gateway workers + CLI), pool file corruption is possible
- Partially-synced credential state can lead to using stale/revoked tokens
- OAuth refresh tokens (which are single-use) can be consumed by one process while another still references the old token
Suggested Fix
Use file-level locking (fcntl.flock or equivalent, matching the pattern already used in hermes_cli/auth.py) around the pool file read/write operations. Alternatively, batch all _persist() calls within _available_entries() into a single write at the end.
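One possible shape for the locking approach is sketched below. It does not reproduce the pattern in hermes_cli/auth.py, which is not shown in this issue. The names pool_file_lock() and update_pool(), the ".lock" sidecar file, and the dict-shaped pool are all hypothetical. The sketch also combines both suggestions: the read-modify-write cycle runs under an exclusive fcntl.flock, and the pool is written once at the end through a temporary file and os.replace, so a reader should never see a half-written file. Note that fcntl is available on Unix only.

```python
import fcntl
import json
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def pool_file_lock(pool_path):
    # Hypothetical helper: take an exclusive advisory lock on a sidecar file.
    lock_path = pool_path + ".lock"
    with open(lock_path, "a+") as lock_file:
        fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX)  # blocks until acquired
        try:
            yield
        finally:
            fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)


def update_pool(pool_path, mutate):
    # Hypothetical helper: read, mutate, and write the pool once under the lock.
    with pool_file_lock(pool_path):
        try:
            with open(pool_path) as f:
                pool = json.load(f)
        except FileNotFoundError:
            pool = {}

        # All sync steps would run here, changing the in-memory pool only.
        mutate(pool)

        # Single write: dump to a temp file in the same directory, then
        # atomically swap it into place.
        directory = os.path.dirname(os.path.abspath(pool_path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w") as f:
            json.dump(pool, f)
        os.replace(tmp_path, pool_path)
        return pool
```

A caller would pass the sync logic as the mutate callback, for example update_pool(path, lambda pool: pool.update({"codex": "new-token"})). Advisory locks only help if every process that touches the pool file goes through the same lock, so the CLI, the gateway, and delegate agents would all need to use it.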
Severity
Medium — race condition under multi-process contention; unlikely in single-process CLI usage.