Skip to content

fix(onboard): classify expired API key as credential error (fixes #1942)#1944

Closed
kagura-agent wants to merge 1 commit into
NVIDIA:mainfrom
kagura-agent:fix-gemini-expired-key-1942
Closed

fix(onboard): classify expired API key as credential error (fixes #1942)#1944
kagura-agent wants to merge 1 commit into
NVIDIA:mainfrom
kagura-agent:fix-gemini-expired-key-1942

Conversation

@kagura-agent

@kagura-agent kagura-agent commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

Problem

When a user enters an expired Google Gemini API key during onboard, the wizard loops back to provider selection without offering to re-enter the key (#1942).

Root cause: classifyValidationFailure() checks HTTP status 400 → kind: "model" before checking the response message. Gemini returns HTTP 400 with "API key expired. Please renew the API key.", so the credential message regex never gets a chance to match.

In the validation flow for Gemini (which uses validateOpenAiLikeSelection without allowModelRetry), kind: "model" falls through to { kind: "unknown", retry: "selection" } in getProbeRecovery(), skipping the key re-entry prompt.

Fix

  1. Move the credential message regex check before the HTTP 400 status check in classifyValidationFailure()
  2. Add api key expired to the credential regex pattern

This ensures expired/invalid API keys are classified as kind: "credential" regardless of HTTP status code, so promptValidationRecovery() correctly offers retry/back/exit.

Changes

  • src/lib/validation.ts — reorder checks: credential message regex before HTTP 400
  • src/lib/validation.test.ts — add tests for HTTP 400 + expired key message, and HTTP 400 + unrelated message
  • src/lib/validation-recovery.test.ts — add probe-level test for expired key recovery

Testing

All 45 validation + recovery tests pass. The 5 pre-existing preflight failures (detecting running gateway) are unrelated.

Summary by CodeRabbit

  • Bug Fixes
    • Improved error classification to recognize expired API key responses as credential failures, enabling proper credential recovery handling.

…DIA#1942)

Gemini returns HTTP 400 with 'API key expired' for expired keys.
classifyValidationFailure() checked HTTP 400 (model) before the
credential message regex, so expired keys were never classified
as credential failures — the user was looped back to provider
selection without a chance to re-enter the key.

Fix: move the credential message regex check before the HTTP 400
status check, and add 'api key expired' to the pattern. This
ensures promptValidationRecovery offers retry/back/exit when the
API key is expired or invalid.
@coderabbitai

coderabbitai Bot commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 3c0cabc4-d008-4fb4-9468-ac0b7c7319bb

📥 Commits

Reviewing files that changed from the base of the PR and between bb5fdb4 and 51a0342.

📒 Files selected for processing (3)
  • src/lib/validation-recovery.test.ts
  • src/lib/validation.test.ts
  • src/lib/validation.ts

📝 Walkthrough

Walkthrough

Changes extend error classification to recognize "API key expired" as a credential-related failure. The validation logic now detects this error message pattern, and corresponding test cases validate that expired API key responses are properly classified as credential-recovery failures.

Changes

Cohort / File(s) Summary
Test Cases
src/lib/validation-recovery.test.ts, src/lib/validation.test.ts
Added new test cases verifying that HTTP 400 responses with "API key expired" messages classify as credential failures; additional test ensures unrelated 400 errors are classified differently.
Validation Logic
src/lib/validation.ts
Extended regex pattern in classifyValidationFailure() to recognize "api key expired" messages alongside existing credential-error keywords, enabling proper classification of expired API key responses as credential-recovery failures.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A key grows old, its time expires,
but now our code knows what transpires—
expired credentials find their way,
to recovery's path without delay! 🔑✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title clearly and specifically summarizes the main change: fixing the classification of expired API key errors as credential errors in the onboarding context.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@wscurran wscurran added bug Something fails against expected or documented behavior Getting Started labels Apr 16, 2026
@wscurran

Copy link
Copy Markdown
Contributor

✨ Thanks for submitting this pull request, which proposes a way to fix the issue with expired API keys not being properly handled during onboarding. This change could help improve the user experience by providing clearer error messages and recovery options.


Possibly related open issues:

@kagura-agent

Copy link
Copy Markdown
Contributor Author

Closing to reduce PR volume. Apologies.

latenighthackathon added a commit to latenighthackathon/NemoClaw that referenced this pull request Apr 22, 2026
…IDIA#1942)

When a Gemini onboard validates with an expired API key, the provider
returns HTTP 400 with the message "API key expired. Please renew the
API key.". The current classifier sees httpStatus === 400 first and
returns { kind: "model", retry: "model" }, so the expired-key case
never reaches the credential-message regex. In the Gemini validation
flow which does not allowModelRetry, kind: "model" falls through
to unknown, and the onboard wizard loops back to provider selection
without offering to re-enter the key.

Fix: check credential-bearing error messages BEFORE the HTTP 400
model default, and extend the credential regex to include 'api key
expired' and API_KEY_INVALID forms so Gemini's API_KEY_INVALID status
classification also lands as credential.

Preserves existing behavior: HTTP 400 without a credential-bearing
message still classifies as model (regression guard test added).

Tests
- src/lib/validation.test.ts: 41 tests pass, +3 for the new paths
  (expired key, API_KEY_INVALID, regression guard)
- src/lib/validation-recovery.test.ts: 5 tests pass

Prior fix attempt (NVIDIA#1944) was closed by its author on 2026-04-18 to
reduce PR volume. This patch replays the same core reorder with a
slightly broader credential regex and preserves the existing test
style.

Closes NVIDIA#1942

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
latenighthackathon added a commit to latenighthackathon/NemoClaw that referenced this pull request Apr 22, 2026
…IDIA#1942)

When a Gemini onboard validates with an expired API key, the provider
returns HTTP 400 with the message "API key expired. Please renew the
API key.". The current classifier sees httpStatus === 400 first and
returns { kind: "model", retry: "model" }, so the expired-key case
never reaches the credential-message regex. In the Gemini validation
flow which does not allowModelRetry, kind: "model" falls through
to unknown, and the onboard wizard loops back to provider selection
without offering to re-enter the key.

Fix: check credential-bearing error messages BEFORE the HTTP 400
model default, and extend the credential regex to include 'api key
expired' and API_KEY_INVALID forms so Gemini's API_KEY_INVALID status
classification also lands as credential.

Preserves existing behavior: HTTP 400 without a credential-bearing
message still classifies as model (regression guard test added).

Tests
- src/lib/validation.test.ts: 41 tests pass, +3 for the new paths
  (expired key, API_KEY_INVALID, regression guard)
- src/lib/validation-recovery.test.ts: 5 tests pass

Prior fix attempt (NVIDIA#1944) was closed by its author on 2026-04-18 to
reduce PR volume. This patch replays the same core reorder with a
slightly broader credential regex and preserves the existing test
style.

Closes NVIDIA#1942

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
ericksoa pushed a commit that referenced this pull request Apr 23, 2026
## Summary

When a Gemini onboard validates with an expired API key, the provider
returns HTTP 400 with `"API key expired. Please renew the API key."`.
The current classifier checks `httpStatus === 400` first and returns `{
kind: "model", retry: "model" }`, so the expired-key case never reaches
the credential-message regex. In the Gemini validation flow (no
`allowModelRetry`), `kind: "model"` falls through to `unknown`, and the
onboard wizard loops back to provider selection without offering to
re-enter the key.

## Related Issue

Closes #1942

## Changes

- `src/lib/validation.ts` — move the credential-bearing message check
**before** the `httpStatus === 400 → model` default. Extend the
credential regex to include `api key expired` and `api[_ ]key[_
]invalid` so Gemini's `API_KEY_INVALID` status classification also lands
as `credential`.
- Inline comment documents why the order matters and cites #1942.
- HTTP 400 without a credential-bearing message still classifies as
`model` — regression guard test added.

## Testing

- [x] `src/lib/validation.test.ts` — 41 tests pass (+3 new: expired key,
API_KEY_INVALID, regression guard)
- [x] `src/lib/validation-recovery.test.ts` — 5 tests pass
- [x] Build + typecheck clean

Executed:
- Targeted `npx vitest run src/lib/validation.test.ts
src/lib/validation-recovery.test.ts` in the `nemoclaw-test` Docker
environment

## Prior art

Prior fix attempt [#1944](#1944)
was closed by its author on 2026-04-18 to reduce PR volume — the issue
remained open. This PR replays the same core reorder with a slightly
broader credential regex (covering both `API key expired` and
`API_KEY_INVALID` forms Gemini uses) and preserves the existing test
style.

## Checklist

- [x] Follows [Conventional
Commits](https://www.conventionalcommits.org/)
- [x] Commits are signed (SSH)
- [x] DCO Signed-off-by trailer present

Signed-off-by: latenighthackathon
<latenighthackathon@users.noreply.github.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Error classification now prioritizes credential-related messages so
API key/credential failures (expired, invalid, unauthorized) are
reported as credential issues rather than being misclassified as model
errors, improving clarity and retry guidance.

* **Tests**
* Added tests covering credential-related 400 responses and a regression
case to ensure unrelated model errors remain classified correctly.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
Co-authored-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
@wscurran wscurran added area: install Install, setup, prerequisites, or uninstall flow area: onboarding Onboarding FSM, provider setup, sandbox launch, or first-run flow bug-fix PR fixes a bug or regression and removed Getting Started bug Something fails against expected or documented behavior labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: install Install, setup, prerequisites, or uninstall flow area: onboarding Onboarding FSM, provider setup, sandbox launch, or first-run flow bug-fix PR fixes a bug or regression

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants