fix(fallback): resolve api_key_env in fallback chain entries (carve-out of #22665) #22856

Merged
teknium1 merged 1 commit into main from salvage/pr-22665-fallback-only
May 10, 2026
@teknium1 teknium1 commented May 9, 2026

Contributor

Summary

Salvage of #22665's fallback portion — fallback chain entries with api_key_env: ENV_VAR_NAME are now resolved correctly at both init time and runtime.

Root cause

run_agent.py had two places (init-time fallback chain construction at line ~1660, runtime _try_activate_fallback at line ~8045) that read fb.get("key_env") but not the documented snake_case alias fb.get("api_key_env"). So a provider: custom fallback with base_url + api_key_env: GOOGLE_API_KEY worked as primary (where the normal _normalize_custom_provider_entry runs and maps the alias) but failed as fallback with "no endpoint credentials found" → 401.
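The asymmetry between the primary and fallback paths can be illustrated with a minimal sketch. `normalize_entry` below is a hypothetical stand-in for `_normalize_custom_provider_entry`, whose real body is not shown in this PR, and the `base_url` value is a placeholder; only the `key_env` / `api_key_env` field names come from the description above.

```python
# Hypothetical stand-in for the normalization applied on the primary path.
# The real _normalize_custom_provider_entry is not shown in this PR; this
# sketch only assumes it maps the api_key_env alias onto key_env.
def normalize_entry(entry: dict) -> dict:
    normalized = dict(entry)
    if "api_key_env" in normalized and "key_env" not in normalized:
        normalized["key_env"] = normalized["api_key_env"]
    return normalized


# A fallback entry written with the documented snake_case alias.
fallback_entry = {
    "provider": "custom",
    "base_url": "https://example.invalid/v1",  # placeholder URL
    "api_key_env": "GOOGLE_API_KEY",
}

# Primary path: the alias is mapped first, so the key_env lookup succeeds.
primary_env_name = normalize_entry(fallback_entry).get("key_env")

# Fallback path before the fix: the raw entry is read and only key_env is
# consulted, so the env var name is never found.
fallback_env_name_before = fallback_entry.get("key_env")
```

With no env var name resolved on the fallback path, no credential is attached to the request, which is consistent with the 401 described above.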

Changes (carve-out from #22665)

  • run_agent.py (~lines 1660 and 8045): add or fb.get("api_key_env") to the existing key_env lookup in both call sites. Empty-string-to-None coercion preserved so unset env vars don't poison the resolver.
  • tests/run_agent/test_fallback_model.py: 3 regression tests in TestFallbackKeyEnvResolution covering both call sites + missing-env case.
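A rough sketch of the lookup after the change, under stated assumptions: `resolve_fallback_api_key` is a hypothetical helper (the actual code is inline in `run_agent.py` at both call sites), and checking a literal `api_key` before the env-based names is an assumption drawn from the commit message rather than from the diff.

```python
import os
from typing import Mapping, Optional


def resolve_fallback_api_key(
    fb: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Hypothetical helper: resolve the API key for one fallback entry."""
    env = os.environ if env is None else env

    # Assumption: a literal api_key on the entry takes precedence.
    explicit = (fb.get("api_key") or "").strip()
    if explicit:
        return explicit

    # The fix: consult the api_key_env alias as well as key_env.
    env_name = fb.get("key_env") or fb.get("api_key_env")
    if not env_name:
        return None

    # Empty-string-to-None coercion: an unset or blank env var yields None
    # instead of an empty key being passed on to the resolver.
    value = (env.get(env_name) or "").strip()
    return value or None


# Mirrors the three cases named in the test bullet, using a fake environment.
fake_env = {"GOOGLE_API_KEY": "secret-value", "BLANK_KEY": ""}
via_alias = resolve_fallback_api_key({"api_key_env": "GOOGLE_API_KEY"}, fake_env)
via_key_env = resolve_fallback_api_key({"key_env": "GOOGLE_API_KEY"}, fake_env)
missing = resolve_fallback_api_key({"api_key_env": "NOT_SET"}, fake_env)
```

Passing the environment as a parameter is a choice made for this sketch so the cases can be exercised without touching `os.environ`; the real tests may patch the environment differently.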

Carve-out rationale

The original PR #22665 by @wesleysimplicio bundled this api_key_env fix with two unrelated changes:

  1. Gateway graceful-degrade when no adapters are available (#5196 territory, "Gateway exits with error when any messaging platform fails to connect (should degrade gracefully)"). This landed via carve-out #22853, "fix(gateway): degrade gracefully when all platform adapters are missing (carve-out of #22642)", which is the same code.
  2. Memory-nudge counter hydration (#22357 territory, "Gateway sessions reset memory nudge counter, so self-improvement review may never trigger"), which is not mentioned in the PR title or description.

This carve-out keeps just the title-described api_key_env work.

Validation

  • 9/9 fallback api_key_env / key_env tests pass on the salvage branch.

Closes #5392 via salvage.

…ut of #22665)

Fallback chain entries with 'api_key_env: ENV_VAR_NAME' weren't being
resolved by either the init-time fallback path (line ~1660) or the
runtime _try_activate_fallback path (line ~8045). Only literal
'api_key' was honored; the snake_case 'api_key_env' alias documented
elsewhere in the config was silently dropped, so a 'provider: custom'
fallback with base_url + api_key_env worked as primary but failed as
fallback with 'no endpoint credentials found' / 401.

Adds 'or fb.get("api_key_env")' to the existing 'key_env' lookup in
both call sites, with empty-string-to-None coercion so unset env vars
don't poison the resolver.

Salvage of #22665's fallback portion. The original PR also bundled
gateway-degrade-on-no-adapters changes (those land via the carve-out
in #22853 which is the same code) and run_agent.py memory-nudge
counter hydration (issue #22357 territory, not mentioned in the
title). Drops both bundled pieces; keeps just the api_key_env fix.

Closes #5392.
github-actions Bot commented May 9, 2026

Contributor

🔎 Lint report: salvage/pr-22665-fallback-only vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 7953 on HEAD, 7953 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 4201 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@alt-glitch added labels May 9, 2026: type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/agent (Core agent loop, run_agent.py, prompt builder), area/auth (Authentication, OAuth, credential pools)
@teknium1 teknium1 merged commit 6ddc48b into main May 10, 2026
15 of 18 checks passed
@teknium1 teknium1 deleted the salvage/pr-22665-fallback-only branch May 10, 2026 00:53
Development

Successfully merging this pull request may close these issues.

[Bug] Custom/Google provider not resolved in fallback_model and fallback_providers

3 participants