feat(plugins): add Google image_gen backend (Imagen 4 + Gemini Flash Image) #19082

Open
ItachiDevv wants to merge 1 commit into NousResearch:main from ItachiDevv:feat/google-image-gen-backend

@ItachiDevv

What

Adds a third image-gen backend at plugins/image_gen/google/, mirroring the existing openai/ and xai/ plugins. Six models exposed through one provider:

| Model | Endpoint |
| --- | --- |
| imagen-4.0-fast-generate-001 (default) | :predict |
| imagen-4.0-generate-001 | :predict |
| imagen-4.0-ultra-generate-001 | :predict |
| gemini-2.5-flash-image | :generateContent |
| gemini-3.1-flash-image-preview | :generateContent |
| gemini-3-pro-image-preview | :generateContent |

Why

Resolves the "Google as image-gen provider" gap discussed in #13798. Closest existing options route through Composio's MCP middleware; this plugin uses a direct Gemini API key, matching the existing openai/xai plugin pattern.

Auth

GEMINI_API_KEY (preferred) or GOOGLE_API_KEY. Image generation requires a Tier 1 (paid) Gemini API project; the plugin surfaces the upstream "paid plan required" / quota errors verbatim.
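For reviewers, the key lookup described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the function name resolve_api_key and the optional env parameter are hypothetical, and only the two variable names and their order come from this PR.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return GEMINI_API_KEY if set, else GOOGLE_API_KEY, else None.

    Hypothetical helper: `env` exists only so the lookup can be exercised
    without touching the real process environment.
    """
    env = os.environ if env is None else env
    for name in ("GEMINI_API_KEY", "GOOGLE_API_KEY"):
        value = (env.get(name) or "").strip()
        if value:
            return value
    return None
```

A provider's availability check could then reduce to `resolve_api_key() is not None`, which lines up with the three availability cases listed under Tests.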

Selection precedence

  1. GOOGLE_IMAGE_MODEL env var
  2. image_gen.google.model in config.yaml
  3. DEFAULT_MODEL (imagen-4.0-fast-generate-001)
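The precedence above could look something like the sketch below. DEFAULT_MODEL and the six model names are taken from this PR; the function name resolve_model, the KNOWN_MODELS set, the config being a nested dict, and the choice to skip an unknown name and keep looking (rather than jump straight to the default) are all assumptions made for illustration.

```python
import os
from typing import Any, Mapping, Optional

DEFAULT_MODEL = "imagen-4.0-fast-generate-001"

# Model names as listed in this PR.
KNOWN_MODELS = frozenset({
    "imagen-4.0-fast-generate-001",
    "imagen-4.0-generate-001",
    "imagen-4.0-ultra-generate-001",
    "gemini-2.5-flash-image",
    "gemini-3.1-flash-image-preview",
    "gemini-3-pro-image-preview",
})


def resolve_model(config: Mapping[str, Any],
                  env: Optional[Mapping[str, str]] = None) -> str:
    """Pick a model: env var first, then config.yaml, then DEFAULT_MODEL."""
    env = os.environ if env is None else env
    google_cfg = (config.get("image_gen") or {}).get("google") or {}
    candidates = (env.get("GOOGLE_IMAGE_MODEL"), google_cfg.get("model"))
    for candidate in candidates:
        # Assumption: unknown names are ignored rather than raising.
        if candidate in KNOWN_MODELS:
            return candidate
    return DEFAULT_MODEL
```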

How to test

export GEMINI_API_KEY=...   # Tier 1 project required
hermes -q "generate an image of a small wooden boat at dawn"
# with image_gen.provider=google and image_gen.google.model=imagen-4.0-ultra-generate-001

Output saves to ~/.hermes/cache/images/google_<model>_<ts>_<id>.png.
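To make the naming pattern concrete, here is one way such a path might be assembled. The PR only states the google_<model>_<ts>_<id>.png shape and the cache directory; the Unix-seconds timestamp, the 8-character hex id, and the helper name output_path are guesses for illustration, not the plugin's actual format.

```python
import time
import uuid
from pathlib import Path
from typing import Optional


def output_path(model: str, cache_dir: Optional[Path] = None) -> Path:
    """Build a path shaped like google_<model>_<ts>_<id>.png.

    Assumptions: <ts> is Unix seconds and <id> is a short random hex string.
    The function only builds the path; it does not create directories.
    """
    if cache_dir is None:
        cache_dir = Path.home() / ".hermes" / "cache" / "images"
    ts = int(time.time())
    short_id = uuid.uuid4().hex[:8]
    return cache_dir / f"google_{model}_{ts}_{short_id}.png"
```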

Tests

24 tests in tests/plugins/image_gen/test_google_provider.py cover:
- availability (with GEMINI_API_KEY, with the GOOGLE_API_KEY fallback, with neither)
- model resolution (default, env override, unknown-fallback)
- both endpoint shapes (success / empty response / API error / timeout)
- aspect ratio mapping (landscape→16:9, portrait→9:16, square→1:1)
- auth correctness (key passed as params={"key": ...}, never hard-coded into the URL string or sent in a header)
- registration via register(ctx)
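The aspect ratio mapping those tests exercise is small enough to sketch here. The three pairs come from this PR; the name map_aspect_ratio and the fallback to 1:1 for unrecognised input are assumptions.

```python
# Pairs as listed in the test description.
ASPECT_RATIOS = {
    "landscape": "16:9",
    "portrait": "9:16",
    "square": "1:1",
}


def map_aspect_ratio(name: str) -> str:
    """Translate a named orientation into a ratio string.

    Assumption: anything unrecognised falls back to square (1:1).
    """
    return ASPECT_RATIOS.get(name.strip().lower(), "1:1")
```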

All 97 sibling image_gen tests still pass.

Platforms tested

Linux (Ubuntu 24.04 / WSL2). Plugin is pure-Python with requests; no OS-specific paths or process logic.

Refs: #13798

…Image)

Adds plugins/image_gen/google/ as a third image-gen backend, mirroring
the existing openai/ and xai/ plugins.

Six models are exposed through one provider:
- imagen-4.0-fast-generate-001 (default)
- imagen-4.0-generate-001
- imagen-4.0-ultra-generate-001
- gemini-2.5-flash-image
- gemini-3.1-flash-image-preview
- gemini-3-pro-image-preview

The plugin dispatches by endpoint shape:
- Imagen models → :predict
- Gemini-image models → :generateContent with responseModalities=["IMAGE"]
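As a rough picture of the dispatch, the sketch below builds the URL and JSON body for each endpoint shape without sending anything. The base URL and body fields follow the public Gemini API request format as I understand it and should be checked against the plugin source; build_request and the prefix-based check are hypothetical names and choices, not the plugin's.

```python
from typing import Any, Dict, Tuple

# Assumption: the public Gemini API v1beta base URL.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(model: str, prompt: str,
                  aspect_ratio: str = "1:1") -> Tuple[str, Dict[str, Any]]:
    """Return (url, json_body) for the endpoint shape that matches `model`."""
    if model.startswith("imagen-"):
        # Imagen models use the :predict shape.
        url = f"{BASE_URL}/{model}:predict"
        body: Dict[str, Any] = {
            "instances": [{"prompt": prompt}],
            "parameters": {"sampleCount": 1, "aspectRatio": aspect_ratio},
        }
    else:
        # Gemini image models use :generateContent and ask for image output.
        url = f"{BASE_URL}/{model}:generateContent"
        body = {
            "contents": [{"parts": [{"text": prompt}]}],
            "generationConfig": {"responseModalities": ["IMAGE"]},
        }
    return url, body
```

A caller would then send it with something like `requests.post(url, params={"key": api_key}, json=body, timeout=60)`, keeping the key out of the formatted URL string; the 60-second timeout is an arbitrary illustrative value.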

Auth via GEMINI_API_KEY (preferred) or GOOGLE_API_KEY. Image-gen models
require Tier 1 (paid) on the Gemini API project; the plugin surfaces the
upstream "paid plan required" / quota errors verbatim.

Selection precedence:
  1. GOOGLE_IMAGE_MODEL env var
  2. image_gen.google.model in config.yaml
  3. DEFAULT_MODEL (imagen-4.0-fast-generate-001)

24 tests added covering availability, model resolution, both endpoint
shapes (success/empty/api-error/timeout), aspect-ratio mapping, auth
(query-param not URL/header), and registration. All sibling image_gen
tests still pass (97 total).
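The success and empty-response cases for the two endpoint shapes could be handled along these lines. The response field names (predictions[].bytesBase64Encoded for :predict, candidates[].content.parts[].inlineData.data for :generateContent) reflect my understanding of the public API and are assumptions here; extract_image_bytes is a hypothetical helper, and raising ValueError on an empty response is an illustrative choice.

```python
import base64
from typing import Any, Dict


def extract_image_bytes(model: str, payload: Dict[str, Any]) -> bytes:
    """Decode the first image found in a :predict or :generateContent reply."""
    if model.startswith("imagen-"):
        for prediction in payload.get("predictions") or []:
            data = prediction.get("bytesBase64Encoded")
            if data:
                return base64.b64decode(data)
    else:
        for candidate in payload.get("candidates") or []:
            parts = (candidate.get("content") or {}).get("parts") or []
            for part in parts:
                data = (part.get("inlineData") or {}).get("data")
                if data:
                    return base64.b64decode(data)
    # Empty response: no image payload in either shape.
    raise ValueError(f"{model} returned no image data")
```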

Refs: NousResearch#13798
alt-glitch added the type/feature (New feature or request), P3 (Low: cosmetic, nice to have), comp/plugins (Plugin system and bundled plugins), and tool/vision (Vision analysis and image generation) labels on May 3, 2026.