Skip to content

fix(tests): isolate credential tests from macOS Keychain#22692

Closed
wesleysimplicio wants to merge 1 commit into
NousResearch:mainfrom
wesleysimplicio:fix/ag02-test-read-claude-code-credentials-keychain-isolation
Closed

fix(tests): isolate credential tests from macOS Keychain#22692
wesleysimplicio wants to merge 1 commit into
NousResearch:mainfrom
wesleysimplicio:fix/ag02-test-read-claude-code-credentials-keychain-isolation

Conversation

@wesleysimplicio

@wesleysimplicio wesleysimplicio commented May 9, 2026

Copy link
Copy Markdown
Contributor

What does this PR do?

Problem

`read_claude_code_credentials()` checks the macOS Keychain before falling back to `~/.claude/.credentials.json`. On a developer machine that has run `claude login`, the Keychain holds a live OAuth token. Three test classes patched `Path.home()` but never suppressed `_read_claude_code_credentials_from_keychain`, so the real token silently won the precedence race:

Test Observed failure
`TestReadClaudeCodeCredentials` (5 tests) Real Keychain token returned instead of the expected file token or `None`
`TestResolveAnthropicToken::test_falls_back_to_claude_code_credentials` Keychain token was expired → refresh failed → function returned `None` instead of `"cc-auto-token"`
`TestRunOauthSetupToken::test_returns_token_from_credential_files` Global `subprocess.run` mock intercepted the Keychain reader's subprocess call; `json.loads(MagicMock)` raised `TypeError`

Fix

Add an `autouse=True` `_no_keychain` fixture to each affected class that uses `monkeypatch` to replace `_read_claude_code_credentials_from_keychain` with `lambda: None`.

Add `TestReadClaudeCodeCredentialsFromKeychain` with two dedicated tests that exercise the Keychain-first priority with a controlled mock token dict.

Tests

All 152 tests in `tests/agent/test_anthropic_adapter.py` pass (was 147 pass + 5 fail before this patch).

Reproduces on any macOS machine where `claude login` has been run.

🤖 Generated with Claude Code

Solution Sketch

  • fix the root cause in the touched subsystem instead of layering a broad workaround around the symptom
  • keep surrounding behavior stable and avoid unrelated refactors while the area is under review
  • prove the change with focused checks on the exact path that regressed

Related Issue

N/A

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • preserved the existing technical rationale and validation notes inside the template body
  • scoped this PR description to the implementation already present on the branch
  • aligned the delivery format with .github/PULL_REQUEST_TEMPLATE.md

How to Test

  1. Review the existing validation notes preserved in this PR body.
  2. Run the focused checks for the touched area.
  3. Confirm the scoped change still behaves as described above.

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform:

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

Screenshots / Logs

  • N/A.

Problem:
read_claude_code_credentials() checks the macOS Keychain first on Darwin
(via _read_claude_code_credentials_from_keychain).  On a developer machine
that has authenticated with Claude Code, the Keychain holds a real OAuth
token.  Tests that only mocked Path.home() never suppressed the Keychain
lookup, so the real token silently won the precedence race.

Five tests in TestReadClaudeCodeCredentials and one each in
TestResolveAnthropicToken / TestRunOauthSetupToken were affected:
- Tests expecting None received real Keychain credentials.
- test_falls_back_to_claude_code_credentials received the (possibly
  expired) Keychain token and resolve_anthropic_token() returned None
  after a failed refresh rather than "cc-auto-token" from the file.
- test_returns_token_from_credential_files: globally-patched subprocess.run
  was also intercepted by the Keychain reader, causing json.loads to fail
  with TypeError: MagicMock is not str.

Root cause:
Missing test isolation for _read_claude_code_credentials_from_keychain.

Fix:
Add an autouse=True _no_keychain fixture to the three affected test classes.
The fixture uses monkeypatch to replace _read_claude_code_credentials_from_keychain
with a no-op lambda so every test in those classes sees None from the Keychain
path.

Add TestReadClaudeCodeCredentialsFromKeychain with two new tests that
explicitly exercise the Keychain-first priority by mocking the function to
return a controlled token dict.

Tests:
- All 152 tests in test_anthropic_adapter.py pass (was 147 pass + 5 fail).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 9, 2026 16:17

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR makes Anthropic adapter tests deterministic on macOS by preventing accidental reads from a developer’s real macOS Keychain (which can contain live Claude Code OAuth credentials), while adding targeted tests for the Keychain-precedence behavior under controlled mocks.

Changes:

  • Add autouse=True fixtures to suppress _read_claude_code_credentials_from_keychain() in three existing test classes so file-based and subprocess-mocking tests don’t race against real Keychain state.
  • Add a new test class that explicitly verifies “Keychain wins over file” and “fallback to file when Keychain empty” behaviors via controlled mocking.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +222 to +274
class TestReadClaudeCodeCredentialsFromKeychain:
"""Verify that a populated Keychain entry takes priority over the JSON file."""

def test_keychain_creds_take_priority_over_file(self, tmp_path, monkeypatch):
"""When Keychain returns credentials, the JSON file is never consulted."""
kc_token = {
"accessToken": "sk-ant-oat01-keychain",
"refreshToken": "kc-refresh",
"expiresAt": int(time.time() * 1000) + 3600_000,
"source": "macos_keychain",
}
monkeypatch.setattr(
"agent.anthropic_adapter._read_claude_code_credentials_from_keychain",
lambda: kc_token,
)
# Even if a JSON file with different data exists, Keychain wins.
cred_file = tmp_path / ".claude" / ".credentials.json"
cred_file.parent.mkdir(parents=True)
cred_file.write_text(json.dumps({
"claudeAiOauth": {
"accessToken": "sk-ant-oat01-from-file",
"refreshToken": "file-refresh",
"expiresAt": int(time.time() * 1000) + 3600_000,
}
}))
monkeypatch.setattr("agent.anthropic_adapter.Path.home", lambda: tmp_path)
creds = read_claude_code_credentials()
assert creds is not None
assert creds["accessToken"] == "sk-ant-oat01-keychain"
assert creds["source"] == "macos_keychain"

def test_falls_back_to_file_when_keychain_empty(self, tmp_path, monkeypatch):
"""When Keychain returns None, the JSON file is used."""
monkeypatch.setattr(
"agent.anthropic_adapter._read_claude_code_credentials_from_keychain",
lambda: None,
)
cred_file = tmp_path / ".claude" / ".credentials.json"
cred_file.parent.mkdir(parents=True)
cred_file.write_text(json.dumps({
"claudeAiOauth": {
"accessToken": "sk-ant-oat01-file-token",
"refreshToken": "file-refresh",
"expiresAt": int(time.time() * 1000) + 3600_000,
}
}))
monkeypatch.setattr("agent.anthropic_adapter.Path.home", lambda: tmp_path)
creds = read_claude_code_credentials()
assert creds is not None
assert creds["accessToken"] == "sk-ant-oat01-file-token"
assert creds["source"] == "claude_code_credentials_file"


@alt-glitch alt-glitch added type/test Test coverage or test infrastructure provider/anthropic Anthropic native Messages API P3 Low — cosmetic, nice to have labels May 9, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Duplicate of #15958 which covers the same Keychain stubbing fix with broader scope (also covers botocore skip tests).

@wesleysimplicio

Copy link
Copy Markdown
Contributor Author

Fechando como duplicado — @alt-glitch confirmou que #15958 cobre o mesmo fix de Keychain stubbing com escopo maior (inclui botocore skip tests). Obrigado pela revisão!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P3 Low — cosmetic, nice to have provider/anthropic Anthropic native Messages API type/test Test coverage or test infrastructure

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants