bug: credential pool reads stale os.environ instead of fresh .env — test documents correct behavior but code doesn't match #20591

@wali-reheman

Bug Description

_get_env_prefer_dotenv() in agent/credential_pool.py is supposed to prefer the user's deliberate ~/.hermes/.env config over stale inherited os.environ vars from parent processes (Codex CLI, test scripts, etc.).

The function's own comment and body say it prefers .env:

# Prefer ~/.hermes/.env over os.environ — the user's config file is the
# authoritative source for Hermes credentials.
def _get_env_prefer_dotenv(key: str) -> str:
    env_file = load_env()
    val = env_file.get(key) or os.environ.get(key) or ""
    return val.strip()

But in practice, os.environ values leak in through the credential pool initialization, causing stale keys to shadow fresh .env edits.
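To make the precedence in that one-liner concrete, here is a minimal, self-contained sketch of the same `or` chain. `lookup_prefer_dotenv` and its `dotenv_values` / `environ` parameters are hypothetical stand-ins for `_get_env_prefer_dotenv`, the result of `load_env()`, and `os.environ`; the `sk-or-FRESH` value is a placeholder. This is not the Hermes code. It illustrates that whenever the loaded mapping lacks the key, or holds an empty string for it, the lookup silently falls through to the process environment.

```python
from typing import Mapping


def lookup_prefer_dotenv(
    key: str,
    dotenv_values: Mapping[str, str],
    environ: Mapping[str, str],
) -> str:
    """Same precedence as the snippet above: .env first, then the environment."""
    # `or` skips falsy values, so a missing key AND an empty string in the
    # .env mapping both fall through to the environment.
    val = dotenv_values.get(key) or environ.get(key) or ""
    return val.strip()


stale_env = {"OPENROUTER_API_KEY": "sk-or-STALE-from-shell"}

# .env loaded correctly: the fresh key wins.
fresh = lookup_prefer_dotenv(
    "OPENROUTER_API_KEY", {"OPENROUTER_API_KEY": "sk-or-FRESH"}, stale_env
)

# .env not found (loader returned an empty mapping): the stale key wins.
missing = lookup_prefer_dotenv("OPENROUTER_API_KEY", {}, stale_env)

# Key present but empty in .env: the stale key also wins.
empty = lookup_prefer_dotenv(
    "OPENROUTER_API_KEY", {"OPENROUTER_API_KEY": ""}, stale_env
)
```

The second and third cases are the ones worth checking first: the function can be "correct" and still return the stale key if the loader hands it the wrong mapping.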

Evidence

Two tests, in tests/tools/test_credential_pool_env_fallback.py and tests/agent/test_credential_pool.py, document the expected behavior:

# test_load_pool_prefers_dotenv_over_stale_os_environ
"""Regression for #18254: stale OPENROUTER_API_KEY in os.environ (inherited
from a parent shell) must NOT shadow the fresh key in ~/.hermes/.env."""
# Expects: .env key wins
# Actual: stale os.environ key wins

Both tests consistently fail with:

AssertionError: Expected .env to win, got 'sk-or-STALE-from-shell'
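For readers without the repository at hand, the shape of such a regression check can be sketched in isolation. Everything below is an assumption made for illustration: `read_dotenv` is a tiny stand-in for the project's `load_env()`, `get_key` is a hypothetical lookup that resolves `HERMES_HOME` at call time, and a plain dict replaces `os.environ` so the sketch does not touch the real process environment. It is not the actual test from the repository.

```python
import tempfile
from pathlib import Path


def read_dotenv(path: Path) -> dict:
    """Tiny KEY=VALUE parser; a stand-in for the project's load_env()."""
    values = {}
    if not path.is_file():
        return values
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        values[name.strip()] = value.strip().strip('"').strip("'")
    return values


def get_key(key: str, environ: dict) -> str:
    """Resolve HERMES_HOME at call time, then prefer .env over environ."""
    home = Path(environ.get("HERMES_HOME", str(Path.home() / ".hermes")))
    dotenv = read_dotenv(home / ".env")
    return (dotenv.get(key) or environ.get(key) or "").strip()


def check_dotenv_beats_stale_environ() -> str:
    """Write a fresh key to a temp .env and confirm it beats the stale one."""
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / ".env").write_text("OPENROUTER_API_KEY=sk-or-FRESH\n")
        fake_environ = {
            "HERMES_HOME": tmp,
            "OPENROUTER_API_KEY": "sk-or-STALE-from-shell",
        }
        result = get_key("OPENROUTER_API_KEY", fake_environ)
        assert result == "sk-or-FRESH", f"Expected .env to win, got {result!r}"
        return result
```

In this sketch the check passes because the home directory is looked up on every call. If the real code resolves it earlier or elsewhere, the same test would fail in the way reported above.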

Root Cause Analysis

The _get_env_prefer_dotenv function itself appears correct: .env is checked before os.environ. The bug is likely in where and how load_hermes_home() resolves during credential pool initialization. In some code paths the HERMES_HOME env var points to the wrong directory, so load_env() reads the wrong .env file (or none at all) and the lookup falls through to os.environ.

The test sets HERMES_HOME via monkeypatch but other parts of the credential pool initialization may bypass this and read the real filesystem .env.

Impact

  • After key rotation in ~/.hermes/.env, users get persistent 401 errors because the stale shell-exported key is still being used
  • The issue affects any user who has environment variables exported in their shell profile (common for Codex/Claude CLI users)
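To check whether a given setup is exposed to this, one can compare the two sources directly. The helper below is a hypothetical diagnostic, not part of Hermes: it takes the parsed .env values and an environment mapping as plain dicts and lists the keys that are set in both with different values, which are the keys where a stale shell export could shadow the file.

```python
from typing import Mapping


def find_conflicting_keys(
    dotenv_values: Mapping[str, str], environ: Mapping[str, str]
) -> list:
    """Return keys set in both sources with different non-empty values."""
    conflicts = []
    for key, file_value in dotenv_values.items():
        file_value = file_value.strip()
        env_value = environ.get(key, "").strip()
        if file_value and env_value and file_value != env_value:
            conflicts.append(key)
    return sorted(conflicts)


conflicts = find_conflicting_keys(
    {"OPENROUTER_API_KEY": "sk-or-FRESH", "OTHER_KEY": "same"},
    {"OPENROUTER_API_KEY": "sk-or-STALE-from-shell", "OTHER_KEY": "same"},
)
```

In real use the second argument would be `os.environ`, and the first would come from whatever loader reads `~/.hermes/.env`.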

Labels: P2 (Medium: degraded but workaround exists), area/auth (Authentication, OAuth, credential pools), comp/agent (Core agent loop, run_agent.py, prompt builder), duplicate (This issue or pull request already exists), type/bug (Something isn't working)
