fix(bench): seed ANTHROPIC_API_KEY in llm_alone dispatch acceptance test by Davidson3556 · Pull Request #2775 · Tracer-Cloud/opensre

Davidson3556 · 2026-06-08T11:14:31Z

Describe the changes you have made in this PR -

test_run_inner_accepts_llm_alone_when_adapter_provides_baseline was failing on every PR opened from a fork. The test calls runner.run_without_integrity(), which activates the LLM via LLMDispatcher.activate() in tests/benchmarks/_framework/llm_dispatch.py:173-180. That activation raises MissingAPIKey if ANTHROPIC_API_KEY is unset. CI on main passes because the secret is injected via .github/workflows/ci.yml:168,370, but GitHub Actions does not pass repository secrets to workflows triggered by fork PRs, so every external contributor's PR fails on this same test.

Change: add monkeypatch: pytest.MonkeyPatch to the test signature and call monkeypatch.setenv("ANTHROPIC_API_KEY", "test-key") before constructing the runner. This is the same pattern that tests/benchmarks/_framework/test_llm_dispatch.py already uses in 5+ places.
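For readers unfamiliar with the pattern, here is a minimal sketch of it. The `MissingAPIKey` class and `activate` function below are hypothetical stand-ins written for this example, not the code in llm_dispatch.py; they only assume a presence check on the environment variable, as described above.

```python
import os


class MissingAPIKey(RuntimeError):
    """Hypothetical stand-in for the dispatcher's exception."""


def activate(env_var: str = "ANTHROPIC_API_KEY") -> str:
    # Assumed behaviour: fail loudly when the variable is absent or empty.
    if not os.environ.get(env_var):
        raise MissingAPIKey(f"{env_var} is not set")
    return "activated"


def test_activate_accepts_dummy_key(monkeypatch):
    # pytest injects the monkeypatch fixture by parameter name; the PR
    # annotates the parameter as pytest.MonkeyPatch.
    monkeypatch.setenv("ANTHROPIC_API_KEY", "test-key")
    # The check only needs a non-empty value, so a dummy key is enough.
    assert activate() == "activated"
```

Run under pytest, the test passes whether or not the variable is set in the surrounding shell, because the fixture sets it for the duration of the test only.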

The other 6 tests in the file are unaffected: they call _run_one_cell directly with an explicit spec=LLM_SPECS["claude-4-sonnet"] and never trigger activate().

Demo/Screenshot for feature changes and bug fixes -

Before (no key — reproduces the fork-PR CI failure):

After (no key, same command):

Also verified hermetic with a stub key (ANTHROPIC_API_KEY=fake-key set): 7 passed. The test is now green whether the env var is set or unset.

Code Understanding and AI Usage

Did you use AI assistance (ChatGPT, Claude, Copilot, etc.) to write any part of this code?

No, I wrote all the code myself
Yes, I used AI assistance (continue below)

If you used AI assistance:

I have reviewed every single line of the AI-generated code
I can explain the purpose and logic of each function/component I added
I have tested edge cases and understand how the code handles them
I have modified the AI output to follow this project's coding standards and conventions

Explain your implementation approach:

The test is verifying that the runner's pre-flight gate accepts an adapter that returns a non-None baseline_agent_class. The downstream investigation pipeline is patched out via patch("app.pipeline.runners.run_investigation", ...), but the dispatcher's environment-variable check fires before the patch is ever reached. So the test was broken by an upstream concern (env presence) that has nothing to do with what it's actually asserting.
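The ordering problem can be reproduced in isolation. The sketch below is an illustration under assumptions: `Pipeline`, `run_cell`, and `outcome_with_patched_pipeline` are hypothetical names, and the environment check is modelled on the behaviour described above rather than copied from the dispatcher.

```python
import os
from unittest.mock import patch


class MissingAPIKey(RuntimeError):
    """Hypothetical stand-in for the dispatcher's exception."""


class Pipeline:
    """Hypothetical namespace standing in for app.pipeline.runners."""

    @staticmethod
    def run_investigation() -> str:
        raise RuntimeError("the real pipeline must not run in a unit test")


def run_cell() -> str:
    # Step 1: environment check. It runs before any downstream call.
    if not os.environ.get("ANTHROPIC_API_KEY"):
        raise MissingAPIKey("ANTHROPIC_API_KEY is not set")
    # Step 2: downstream call, the only part the mock replaces.
    return Pipeline.run_investigation()


def outcome_with_patched_pipeline() -> str:
    # Patch the downstream call only, as the original test did.
    with patch.object(Pipeline, "run_investigation", return_value="stubbed"):
        try:
            return run_cell()
        except MissingAPIKey:
            return "MissingAPIKey"
```

With the variable unset, `outcome_with_patched_pipeline()` returns `"MissingAPIKey"` even though the pipeline is mocked; with any non-empty value it returns `"stubbed"`.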

I considered three approaches:

1. pytest.mark.skipif(not os.getenv("ANTHROPIC_API_KEY")). Would skip the test on fork PRs but also silently drop it on any local dev run without a key. The point of the test is to gate the runner's pre-flight, and that gate should be exercised on every PR, not just ones where someone happens to have a key.
2. Add ANTHROPIC_API_KEY to the workflow's env: block unconditionally with a fake value. Too invasive: changes CI for every job, and the dummy key value would show up in workflow logs.
3. monkeypatch.setenv("ANTHROPIC_API_KEY", "test-key") inside the test. Hermetic, scoped to the one test that needs it, automatically reverted after the test by pytest's monkeypatch fixture. Already the established pattern in tests/benchmarks/_framework/test_llm_dispatch.py (5+ uses).

I went with option 3. The dispatcher only checks env-var presence, not key validity, so a dummy "test-key" value is enough. With run_investigation already patched, no real Anthropic API call ever happens, so a fake key is safe.
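The scoping argument for option 3 depends on the override being undone once the test finishes. The sketch below illustrates that set-then-restore behaviour using the standard library's `unittest.mock.patch.dict` as an analogue; it is not pytest's monkeypatch fixture, and `key_states_around_patch` is a hypothetical helper written for this example.

```python
import os
from typing import Optional, Tuple
from unittest.mock import patch

KEY = "ANTHROPIC_API_KEY"


def key_states_around_patch() -> Tuple[Optional[str], Optional[str], Optional[str]]:
    """Return the key's value before, during, and after a scoped override."""
    before = os.environ.get(KEY)
    # patch.dict snapshots os.environ on entry and restores it on exit.
    with patch.dict(os.environ, {KEY: "test-key"}):
        during = os.environ.get(KEY)
    after = os.environ.get(KEY)
    return before, during, after
```

If the key was unset beforehand it is unset again afterwards, and if it held a real value that value comes back, so the dummy key should not leak into other tests in the same process.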

One thing I deliberately did not change: the dispatcher's MissingAPIKey exception itself. It is correct behavior for production runs (you want to fail loud if the key is missing). The bug was that one specific test was indirectly exercising that production check without the test setup it needed.

Checklist before requesting a review

I have added proper PR title and linked to the issue
I have performed a self-review of my code
I can explain the purpose of every function, class, and logic block I added
I understand why my changes work and have tested them thoroughly
I have considered potential edge cases and how my code handles them
If it is a core feature, I have added thorough tests
My code follows the project's style guidelines and conventions

github-actions · 2026-06-08T11:14:40Z

Greptile code review

This repo uses Greptile for automated review. Before merge, aim for Confidence Score: 5/5 with zero unresolved review threads — see CONTRIBUTING.md.

Run a review — add a PR comment with:

@greptile review

Give it ~5-10 minutes (sometimes longer) for results, then fix feedback and re-trigger until you reach Confidence Score: 5/5.

Optional: automate with the greploop skill.

greptile-apps · 2026-06-08T11:16:13Z

Greptile Summary

This PR fixes a CI failure on fork PRs where test_run_inner_accepts_llm_alone_when_adapter_provides_baseline raised MissingAPIKey because ANTHROPIC_API_KEY is stripped from fork workflow environments by GitHub Actions. The fix adds monkeypatch.setenv("ANTHROPIC_API_KEY", "test-key") before runner.run_without_integrity() is called, matching the pattern already used in five-plus tests in test_llm_dispatch.py.

Adds monkeypatch: pytest.MonkeyPatch to the test signature and injects a dummy key so the dispatcher's env-check passes without triggering a real API call.
The patch is automatically reverted after the test by pytest's monkeypatch fixture, so there is no env leakage to other tests.

Confidence Score: 5/5

Safe to merge — a one-line test setup addition with no production code changes.

The change touches a single test, adds a dummy env var that is automatically cleaned up by pytest's monkeypatch fixture, and mirrors an already-proven pattern used elsewhere in the test suite. No production paths are affected, no real API calls can occur (run_investigation is patched out), and the other six tests in the file are unaffected.

No files require special attention.

Important Files Changed

Filename	Overview
tests/benchmarks/_framework/test_runner_llm_alone_dispatch.py	Adds monkeypatch.setenv("ANTHROPIC_API_KEY", "test-key") to the one test that reaches LLMDispatcher.activate(); the change is minimal, correctly scoped, and follows the established pattern in test_llm_dispatch.py.

Sequence Diagram

sequenceDiagram
    participant T as Test
    participant MP as monkeypatch
    participant R as BenchmarkRunner
    participant D as LLMDispatcher.activate()
    participant CI as run_investigation (patched)

    T->>MP: setenv("ANTHROPIC_API_KEY", "test-key")
    T->>R: BenchmarkRunner(config, adapter)
    T->>R: run_without_integrity()
    R->>R: _run_inner() pre-flight gate
    R->>D: activate("claude-4-sonnet")
    D->>D: Check ANTHROPIC_API_KEY ✓
    D-->>R: spec context
    R->>CI: run_investigation(...) [patched stub]
    CI-->>R: "{root_cause: ok}"
    R-->>T: outcome (not aborted)
    T->>MP: teardown → restore original ANTHROPIC_API_KEY state

Reviews (1): Last reviewed commit: "fix(bench): seed ANTHROPIC_API_KEY in ll..."

Davidson3556 · 2026-06-08T11:17:21Z

@muddlebee @cerencamkiran kindly review

The runner's pre-flight gate test exercises LLMDispatcher.activate(), which requires ANTHROPIC_API_KEY to be set even when the downstream LLM call is patched out. CI on main passes because the secret is injected from secrets.ANTHROPIC_API_KEY, but GitHub Actions strips secrets from fork-PR workflows, so any external contributor sees the test fail. Set a hermetic test-key via monkeypatch.setenv — same pattern as test_llm_dispatch.py uses 5+ times. Test is now hermetic whether ANTHROPIC_API_KEY is set or unset. Closes Tracer-Cloud#2774

github-actions · 2026-06-09T03:42:59Z

🍵 @Davidson3556 made tea, opened a PR, and merged before it cooled. No notes. ☕

👋 Join us on Discord - OpenSRE : hang out, contribute, or hunt for features and issues. Everyone's welcome.

psyberck mentioned this pull request Jun 8, 2026

feat(redis): add redis integration #2699

Closed

13 tasks

Davidson3556 force-pushed the fix/bench-anthropic-api-key-test branch from 0d3bdae to 3657dc1 Compare June 8, 2026 19:57

Devesh36 merged commit 1f5a454 into Tracer-Cloud:main Jun 9, 2026
14 checks passed
