Skip to content

feat: add Gemini direct API backend with OAuth and CLI fallback#131

Merged
bbsngg merged 4 commits intoOpenLAIR:mainfrom
liuyixin-louis:feat/gemini-direct-api
Apr 9, 2026
Merged

feat: add Gemini direct API backend with OAuth and CLI fallback#131
bbsngg merged 4 commits intoOpenLAIR:mainfrom
liuyixin-louis:feat/gemini-direct-api

Conversation

@liuyixin-louis
Copy link
Copy Markdown
Collaborator

Summary

This PR replaces the Gemini subprocess-only path with a direct API backend that works for both API-key and OAuth-only users.

What changed

  • added a new direct Gemini provider in server/gemini-api.js
  • added Gemini OAuth token/header resolution in server/utils/geminiOAuth.js
  • wired Gemini WebSocket and agent routes to prefer direct API and fall back to CLI when needed
  • implemented Gemini agent loop support for streaming, tool calls, tool execution, and tool-result replay
  • added OAuth Code Assist compatibility, including thoughtSignature handling
  • added retry/backoff for transient Gemini failures
  • added best-effort fallback from exhausted direct API retries into Gemini CLI by bridging session history into Gemini CLI chat storage
  • added direct model fallback for Gemini capacity exhaustion before invoking the CLI
  • switched the default Gemini chat model to gemini-2.5-flash for better runtime stability
  • cleaned the branch by removing the temporary manual smoke-test script before opening the PR
  • added regression coverage for auth selection, OAuth tool loop, session indexing, retry exhaustion, model fallback, and CLI fallback behavior

Notes

  • API key users go through the public Gemini REST endpoint
  • OAuth-only users go through the Code Assist direct API path
  • retryable failures now attempt direct model fallback first, then a best-effort CLI fallback instead of immediately surfacing an error
  • the CLI fallback is session-preserving on a best-effort basis, but it is not a byte-for-byte continuation of an in-flight HTTP turn

Test Plan

  • fnm exec --using=22 node --check server/gemini-api.js
  • fnm exec --using=22 npm test -- server/__tests__/gemini-api.test.mjs server/__tests__/gemini-session-index.test.mjs server/__tests__/gemini-legacy-thought.test.mjs server/__tests__/session-delete.test.mjs

Copy link
Copy Markdown
Collaborator

@Zhang-Henry Zhang-Henry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Large PR (~2800 lines added), introducing a full Gemini direct API backend. Overall architecture is clean and test coverage is solid, but there are several issues that should be addressed before merging.


Blocking Issues

1. Undefined model variable in fallback path
server/gemini-api.js (~L2227): In the Code Assist bootstrap error handling block, the fallback call passes model but only requestedModel and activeModel are defined in scope. This will throw a ReferenceError at runtime.

2. contents.push inside multi-tool loop corrupts conversation history
server/gemini-api.js (~L2519): When a model turn returns multiple function calls, contents.push for the model turn happens inside the for loop. After the first iteration, pendingFunctionCalls.splice(...) is already empty, so subsequent model turns will have empty parts. All function responses should be collected and pushed once after the loop completes.

3. Shell injection in glob and grep_search tools
server/gemini-api.js (~L1143-1170): The target path from resolveToolPath is interpolated directly into shell command strings without escaping. If the path contains single quotes (e.g., a user's home dir like O'Brien/project), this breaks out of the quoting. Should use execFile with an argument array instead of string interpolation to avoid shell parsing entirely.

4. SSRF vulnerability in web_fetch
server/gemini-api.js (~L1176): No URL validation — the AI can request internal network addresses (http://169.254.169.254/, http://localhost:3000/admin, etc.). At minimum, reject private IP ranges and loopback addresses. Also note that web_fetch is in READ_ONLY_TOOLS, so it executes without user confirmation in plan mode despite being able to leak internal info.


Should Fix

5. Auth priority inconsistency
queryGeminiApi (~L2200) checks env.GEMINI_API_KEY first (API key priority), but getGeminiAuthHeaders in geminiOAuth.js (~L3013) tries OAuth first. These two layers disagree on precedence, making behavior unpredictable depending on where credentials are set.

6. codeAssistProjectCache never expires
server/gemini-api.js (~L847): The global Map has no TTL. If a user switches accounts, the cached Project ID from the old account persists until process restart.

7. Fragile isError detection
server/gemini-api.js (~L2495): output.startsWith('Error') will false-positive on normal content that begins with "Error" (e.g., reading a file named Error.md). Consider using a structured { success, output } return instead of string prefix conventions.

8. write_todos in READ_ONLY_TOOLS
server/gemini-api.js (~L1024): Semantically a write operation — shouldn't be allowed without confirmation in plan mode.

9. findInTree unbounded traversal
server/utils/geminiOAuth.js (~L3061): Depth-10 recursive readdirSync can block the event loop for seconds in large directories like node_modules. Consider constraining the search scope.


Minor

  • read_many_files (~L1098): Missing per-file error handling — exceptions bubble up and lose info about which file failed.
  • refreshAccessToken: After writing tokens, Date.now() is used as mtime cache, but next loadGeminiOAuthTokens reads the real mtime, causing immediate cache invalidation.
  • sessionsDir() runs fs.mkdir on every call — could initialize once.

@bbsngg
Copy link
Copy Markdown
Contributor

bbsngg commented Apr 8, 2026

@liuyixin-louis

@liuyixin-louis
Copy link
Copy Markdown
Collaborator Author

Response to Code Review

Thanks @Zhang-Henry for the thorough review — all issues were valid. Here's what was addressed in commit acbc370.


Blocking Issues (all resolved)

1. Undefined model variable in fallback path
Fixed: changed modelactiveModel in the Code Assist bootstrap error handler.

2. contents.push inside multi-tool loop corrupts conversation history
Fixed: restructured the loop to collect all function call metadata and all function responses separately, then push one model turn and one user turn after the loop completes. This produces the correct Gemini API conversation format for multi-tool responses.

3. Shell injection in glob and grep_search
Fixed: replaced execAsync (string interpolation into shell) with execFileAsync (argument array, no shell parsing). Both glob and grep_search now use execFile('rg', [...args]) directly.

4. SSRF vulnerability in web_fetch
Fixed: added URL validation that blocks localhost, 127.0.0.1, ::1, 0.0.0.0, .local, 169.254.*, 10.*, 192.168.*, 172.16-31.*, and file: protocol. Also set redirect: 'manual' to prevent redirect-based bypass. Removed web_fetch from READ_ONLY_TOOLS so it now requires user confirmation in plan mode.


Should Fix (all resolved)

5. Auth priority inconsistency
Fixed: getGeminiAuthHeaders() now checks API key first, then falls back to OAuth — matching queryGeminiApi's env-var precedence.

6. codeAssistProjectCache never expires
Fixed: cache entries now include a timestamp and are evicted after 30 minutes (CODE_ASSIST_CACHE_TTL_MS).

7. Fragile isError detection
Fixed: added typeof output === 'string' guard before startsWith('Error') check.

8. write_todos in READ_ONLY_TOOLS
Fixed: removed write_todos from READ_ONLY_TOOLS. It's a write operation and should require permission in plan mode.


Additional fixes beyond the review

These were found via a secondary automated scan and addressed in the same commit:

  • Path traversal in tool executor: resolveToolPath() now rejects any resolved path that escapes the working directory. Previously, absolute paths or ../ sequences could read/write arbitrary files on the server.
  • Session ID path traversal: sessionFilePath() now sanitizes session IDs, stripping /, \, and .. sequences to prevent directory escape in session storage.
  • OAuth refresh_token-only state: getValidAccessToken() now attempts a token refresh when access_token is missing but refresh_token exists, instead of returning null.
  • Stage tags using wrong session ID: applyStageTagsToSession was called with sessionId (the option) instead of currentSessionId (the actual ID), causing newly generated sessions to miss tags.
  • Phantom sessions on 401/403: When auth fails mid-session (after session-created was sent), we now send gemini-error + gemini-complete before returning, so the UI properly closes the session.
  • read_many_files per-file error handling: Individual file read failures now return an error message for that file instead of failing the entire batch.

All 51 existing tests pass.

liuyixin-louis and others added 3 commits April 8, 2026 21:18
…rect API

- Fix undefined `model` variable in Code Assist bootstrap fallback (use `activeModel`)
- Fix multi-tool loop: collect all function calls/responses, push to contents once
  instead of per-iteration (prevents corrupted conversation history)
- Fix shell injection in glob/grep_search: use execFile with argument arrays
  instead of string interpolation into shell commands
- Fix SSRF in web_fetch: block private/internal network addresses and loopback;
  remove web_fetch and write_todos from READ_ONLY_TOOLS (both had wrong semantics)
- Fix auth priority inconsistency: getGeminiAuthHeaders now checks API key first,
  matching queryGeminiApi's env-var precedence
- Add TTL (30m) to codeAssistProjectCache to prevent stale entries on account switch
- Add path sandbox to resolveToolPath: reject paths that escape the working directory
- Sanitize sessionId to prevent path traversal in session file storage
- Fix getValidAccessToken to attempt refresh when only refresh_token exists
- Fix stage tags using original sessionId instead of currentSessionId
- Send gemini-error + gemini-complete on 401/403 after session-created (prevents
  phantom sessions in UI)
- Add per-file error handling in read_many_files
- Add typeof guard to isError string check

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@bbsngg bbsngg force-pushed the feat/gemini-direct-api branch from acbc370 to 5d63ddd Compare April 9, 2026 01:21
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@bbsngg bbsngg merged commit 6b67730 into OpenLAIR:main Apr 9, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants