Skip to content

feat(providers): route gemini through the native AI Studio API#12656

Closed
kshitijk4poor wants to merge 3 commits into
mainfrom
feat/native-gemini-provider
Closed

feat(providers): route gemini through the native AI Studio API#12656
kshitijk4poor wants to merge 3 commits into
mainfrom
feat/native-gemini-provider

Conversation

@kshitijk4poor

Copy link
Copy Markdown
Collaborator

Summary

  • switch the built-in gemini provider from Google's OpenAI-compatible endpoint to the native AI Studio API
  • add a native Gemini adapter over models/{model}:generateContent and :streamGenerateContent?alt=sse
  • preserve Gemini-specific tool-call replay details (thoughtSignature and correct functionResponse.name) while routing auxiliary Gemini calls through the same adapter

Why

The current first-class Gemini provider still rides the OpenAI-compatible endpoint, which has been brittle for Hermes's agent/tool loop (#4983 and the auth/tool-call fallout in #7893, #12127, #12168). Other agent repos that handle Gemini more reliably keep Gemini on its native request/response shape instead of treating it as plain OpenAI chat completions.

This keeps Hermes's outer agent loop unchanged (api_mode=chat_completions) but swaps the underlying transport to a native Gemini adapter.

What changed

  • added agent/gemini_native_adapter.py
    • OpenAI-shaped facade with .chat.completions.create(...)
    • native request translation (contents, tools, toolConfig, generationConfig)
    • native response + streaming translation back into Hermes/OpenAI-shaped objects
    • Gemini API error wrapper with status_code / retry_after
  • changed the built-in gemini provider default base URL to https://generativelanguage.googleapis.com/v1beta
  • updated run_agent.py to construct GeminiNativeClient for the built-in gemini provider
  • updated agent/auxiliary_client.py so explicit/auto Gemini auxiliary clients use the same native adapter (including async wrapping)
  • updated Gemini provider tests and added adapter-focused tests

Test plan

  • bash scripts/run_tests.sh tests/agent/test_gemini_native_adapter.py tests/hermes_cli/test_gemini_provider.py -q
  • bash scripts/run_tests.sh tests/agent/test_auxiliary_client.py tests/hermes_cli/test_gemini_provider.py tests/agent/test_gemini_native_adapter.py -q
  • bash scripts/run_tests.sh tests/run_agent/test_run_agent.py tests/run_agent/test_streaming.py tests/agent/test_gemini_cloudcode.py -q -k 'tool_call_extra_content_preserved or gemini_cloudcode'
  • E2E via isolated local fake Gemini native server:
    • verified /v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse
    • verified x-goog-api-key auth header
    • verified second turn replays functionResponse with the original tool name

- add a native Gemini adapter over generateContent/streamGenerateContent
- switch the built-in gemini provider off the OpenAI-compatible endpoint
- preserve thought signatures and native functionResponse replay
- route auxiliary Gemini clients through the same adapter
- add focused unit coverage plus native-provider integration checks
@github-actions

Copy link
Copy Markdown
Contributor

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: base64 encoding/decoding detected

Base64 has legitimate uses (images, JWT, etc.) but is also commonly used to obfuscate malicious payloads. Verify the usage is appropriate.

Matches (first 20):

191:+                raw = base64.b64decode(encoded)
198:+                        "data": base64.b64encode(raw).decode("ascii"),

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

- only use the native adapter for the canonical Gemini native endpoint
- keep custom and /openai base URLs on the OpenAI-compatible path
- preserve Hermes keepalive transport injection for native Gemini clients
- stabilize streaming tool-call replay across repeated SSE events
- add follow-up tests for base_url precedence, async streaming, and duplicate tool-call chunks
- preserve explicit Gemini base_url/api_key in auxiliary auto routing
- make native SSE parsing handle multiline data frames correctly
- add regression tests for auxiliary base_url precedence and SSE parsing
- document native Gemini defaults and explicit OpenAI-compatible overrides
@teknium1

Copy link
Copy Markdown
Contributor

Merged via #12674 — your commits are on main with authorship preserved (3dea497, d393104). Salvage note: folded proxy-env support into the refactored _build_keepalive_http_client() so the new Gemini-native client honors HTTPS_PROXY. Thanks @kshitijk4poor!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants