feat: add Gemini direct API backend with OAuth and CLI fallback by liuyixin-louis · Pull Request #131 · OpenLAIR/dr-claw

liuyixin-louis · 2026-04-06T00:38:39Z

Summary

This PR replaces the Gemini subprocess-only path with a direct API backend that works for both API-key and OAuth-only users.

What changed

added a new direct Gemini provider in server/gemini-api.js
added Gemini OAuth token/header resolution in server/utils/geminiOAuth.js
wired Gemini WebSocket and agent routes to prefer direct API and fall back to CLI when needed
implemented Gemini agent loop support for streaming, tool calls, tool execution, and tool-result replay
added OAuth Code Assist compatibility, including thoughtSignature handling
added retry/backoff for transient Gemini failures
added best-effort fallback from exhausted direct API retries into Gemini CLI by bridging session history into Gemini CLI chat storage
added direct model fallback for Gemini capacity exhaustion before invoking the CLI
switched the default Gemini chat model to gemini-2.5-flash for better runtime stability
cleaned the branch by removing the temporary manual smoke-test script before opening the PR
added regression coverage for auth selection, OAuth tool loop, session indexing, retry exhaustion, model fallback, and CLI fallback behavior

Notes

API key users go through the public Gemini REST endpoint
OAuth-only users go through the Code Assist direct API path
retryable failures now attempt direct model fallback first, then a best-effort CLI fallback instead of immediately surfacing an error
the CLI fallback is session-preserving on a best-effort basis, but it is not a byte-for-byte continuation of an in-flight HTTP turn

Test Plan

fnm exec --using=22 node --check server/gemini-api.js
fnm exec --using=22 npm test -- server/__tests__/gemini-api.test.mjs server/__tests__/gemini-session-index.test.mjs server/__tests__/gemini-legacy-thought.test.mjs server/__tests__/session-delete.test.mjs

Zhang-Henry

Code Review

Large PR (~2800 lines added), introducing a full Gemini direct API backend. Overall architecture is clean and test coverage is solid, but there are several issues that should be addressed before merging.

Blocking Issues

1. Undefined model variable in fallback path
server/gemini-api.js (~L2227): In the Code Assist bootstrap error handling block, the fallback call passes model but only requestedModel and activeModel are defined in scope. This will throw a ReferenceError at runtime.

2. contents.push inside multi-tool loop corrupts conversation history
server/gemini-api.js (~L2519): When a model turn returns multiple function calls, contents.push for the model turn happens inside the for loop. After the first iteration, pendingFunctionCalls.splice(...) is already empty, so subsequent model turns will have empty parts. All function responses should be collected and pushed once after the loop completes.

3. Shell injection in glob and grep_search tools
server/gemini-api.js (~L1143-1170): The target path from resolveToolPath is interpolated directly into shell command strings without escaping. If the path contains single quotes (e.g., a user's home dir like O'Brien/project), this breaks out of the quoting. Should use execFile with an argument array instead of string interpolation to avoid shell parsing entirely.

4. SSRF vulnerability in web_fetch
server/gemini-api.js (~L1176): No URL validation — the AI can request internal network addresses (http://169.254.169.254/, http://localhost:3000/admin, etc.). At minimum, reject private IP ranges and loopback addresses. Also note that web_fetch is in READ_ONLY_TOOLS, so it executes without user confirmation in plan mode despite being able to leak internal info.

Should Fix

5. Auth priority inconsistency
queryGeminiApi (~L2200) checks env.GEMINI_API_KEY first (API key priority), but getGeminiAuthHeaders in geminiOAuth.js (~L3013) tries OAuth first. These two layers disagree on precedence, making behavior unpredictable depending on where credentials are set.

6. codeAssistProjectCache never expires
server/gemini-api.js (~L847): The global Map has no TTL. If a user switches accounts, the cached Project ID from the old account persists until process restart.

7. Fragile isError detection
server/gemini-api.js (~L2495): output.startsWith('Error') will false-positive on normal content that begins with "Error" (e.g., reading a file named Error.md). Consider using a structured { success, output } return instead of string prefix conventions.

8. write_todos in READ_ONLY_TOOLS
server/gemini-api.js (~L1024): Semantically a write operation — shouldn't be allowed without confirmation in plan mode.

9. findInTree unbounded traversal
server/utils/geminiOAuth.js (~L3061): Depth-10 recursive readdirSync can block the event loop for seconds in large directories like node_modules. Consider constraining the search scope.

Minor

read_many_files (~L1098): Missing per-file error handling — exceptions bubble up and lose info about which file failed.
refreshAccessToken: After writing tokens, Date.now() is used as mtime cache, but next loadGeminiOAuthTokens reads the real mtime, causing immediate cache invalidation.
sessionsDir() runs fs.mkdir on every call — could initialize once.

bbsngg · 2026-04-08T18:13:01Z

@liuyixin-louis

liuyixin-louis · 2026-04-08T20:25:07Z

Response to Code Review

Thanks @Zhang-Henry for the thorough review — all issues were valid. Here's what was addressed in commit acbc370.

Blocking Issues (all resolved)

1. Undefined model variable in fallback path
Fixed: changed model → activeModel in the Code Assist bootstrap error handler.

2. contents.push inside multi-tool loop corrupts conversation history
Fixed: restructured the loop to collect all function call metadata and all function responses separately, then push one model turn and one user turn after the loop completes. This produces the correct Gemini API conversation format for multi-tool responses.

3. Shell injection in glob and grep_search
Fixed: replaced execAsync (string interpolation into shell) with execFileAsync (argument array, no shell parsing). Both glob and grep_search now use execFile('rg', [...args]) directly.

4. SSRF vulnerability in web_fetch
Fixed: added URL validation that blocks localhost, 127.0.0.1, ::1, 0.0.0.0, .local, 169.254.*, 10.*, 192.168.*, 172.16-31.*, and file: protocol. Also set redirect: 'manual' to prevent redirect-based bypass. Removed web_fetch from READ_ONLY_TOOLS so it now requires user confirmation in plan mode.

Should Fix (all resolved)

5. Auth priority inconsistency
Fixed: getGeminiAuthHeaders() now checks API key first, then falls back to OAuth — matching queryGeminiApi's env-var precedence.

6. codeAssistProjectCache never expires
Fixed: cache entries now include a timestamp and are evicted after 30 minutes (CODE_ASSIST_CACHE_TTL_MS).

7. Fragile isError detection
Fixed: added typeof output === 'string' guard before startsWith('Error') check.

8. write_todos in READ_ONLY_TOOLS
Fixed: removed write_todos from READ_ONLY_TOOLS. It's a write operation and should require permission in plan mode.

Additional fixes beyond the review

These were found via a secondary automated scan and addressed in the same commit:

Path traversal in tool executor: resolveToolPath() now rejects any resolved path that escapes the working directory. Previously, absolute paths or ../ sequences could read/write arbitrary files on the server.
Session ID path traversal: sessionFilePath() now sanitizes session IDs, stripping /, \, and .. sequences to prevent directory escape in session storage.
OAuth refresh_token-only state: getValidAccessToken() now attempts a token refresh when access_token is missing but refresh_token exists, instead of returning null.
Stage tags using wrong session ID: applyStageTagsToSession was called with sessionId (the option) instead of currentSessionId (the actual ID), causing newly generated sessions to miss tags.
Phantom sessions on 401/403: When auth fails mid-session (after session-created was sent), we now send gemini-error + gemini-complete before returning, so the UI properly closes the session.
read_many_files per-file error handling: Individual file read failures now return an error message for that file instead of failing the entire batch.

All 51 existing tests pass.

…rect API - Fix undefined `model` variable in Code Assist bootstrap fallback (use `activeModel`) - Fix multi-tool loop: collect all function calls/responses, push to contents once instead of per-iteration (prevents corrupted conversation history) - Fix shell injection in glob/grep_search: use execFile with argument arrays instead of string interpolation into shell commands - Fix SSRF in web_fetch: block private/internal network addresses and loopback; remove web_fetch and write_todos from READ_ONLY_TOOLS (both had wrong semantics) - Fix auth priority inconsistency: getGeminiAuthHeaders now checks API key first, matching queryGeminiApi's env-var precedence - Add TTL (30m) to codeAssistProjectCache to prevent stale entries on account switch - Add path sandbox to resolveToolPath: reject paths that escape the working directory - Sanitize sessionId to prevent path traversal in session file storage - Fix getValidAccessToken to attempt refresh when only refresh_token exists - Fix stage tags using original sessionId instead of currentSessionId - Send gemini-error + gemini-complete on 401/403 after session-created (prevents phantom sessions in UI) - Add per-file error handling in read_many_files - Add typeof guard to isError string check Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Zhang-Henry reviewed Apr 6, 2026

View reviewed changes

liuyixin-louis and others added 3 commits April 8, 2026 21:18

feat: add Gemini direct API backend with OAuth and CLI fallback

6cb1651

fix: stabilize gemini direct fallback behavior

c61e8f3

bbsngg force-pushed the feat/gemini-direct-api branch from acbc370 to 5d63ddd Compare April 9, 2026 01:21

fix: add missing imports for AutoResearchProvider and model constants

6594360

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

bbsngg merged commit 6b67730 into OpenLAIR:main Apr 9, 2026
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add Gemini direct API backend with OAuth and CLI fallback#131

feat: add Gemini direct API backend with OAuth and CLI fallback#131
bbsngg merged 4 commits intoOpenLAIR:mainfrom
liuyixin-louis:feat/gemini-direct-api

liuyixin-louis commented Apr 6, 2026

Uh oh!

Zhang-Henry left a comment

Uh oh!

bbsngg commented Apr 8, 2026

Uh oh!

liuyixin-louis commented Apr 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

liuyixin-louis commented Apr 6, 2026

Summary

What changed

Notes

Test Plan

Uh oh!

Zhang-Henry left a comment

Choose a reason for hiding this comment

Code Review

Blocking Issues

Should Fix

Minor

Uh oh!

bbsngg commented Apr 8, 2026

Uh oh!

liuyixin-louis commented Apr 8, 2026

Response to Code Review

Blocking Issues (all resolved)

Should Fix (all resolved)

Additional fixes beyond the review

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants