Problem
Hermes's image_gen toolset currently only supports FAL.ai (FLUX 2 Pro). Users who access image generation through OpenAI-compatible gateways — OpenAI direct, tu-zi, OpenRouter, self-hosted proxies — cannot plug in their existing endpoint.
Proposal
Add a third provider "OpenAI Compatible" to the image_gen providers list in hermes_cli/tools_config.py. Configured via three env vars:
OPENAI_IMAGE_API_KEY
OPENAI_IMAGE_BASE_URL (default: https://api.openai.com/v1)
OPENAI_IMAGE_MODEL (default: gpt-image-1)
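A rough sketch of how the three env vars could be read, with the defaults listed above. The names `OpenAIImageConfig` and `load_openai_image_config` are hypothetical and not part of Hermes today. Treating an empty value as unset is also my assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_BASE_URL = "https://api.openai.com/v1"
DEFAULT_MODEL = "gpt-image-1"


@dataclass(frozen=True)
class OpenAIImageConfig:
    """Hypothetical container for the proposed provider settings."""
    api_key: str
    base_url: str = DEFAULT_BASE_URL
    model: str = DEFAULT_MODEL


def load_openai_image_config(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[OpenAIImageConfig]:
    """Return the config, or None when OPENAI_IMAGE_API_KEY is not set."""
    env = os.environ if env is None else env
    api_key = (env.get("OPENAI_IMAGE_API_KEY") or "").strip()
    if not api_key:
        return None
    return OpenAIImageConfig(
        api_key=api_key,
        # Assumption: an empty string falls back to the default.
        base_url=env.get("OPENAI_IMAGE_BASE_URL") or DEFAULT_BASE_URL,
        model=env.get("OPENAI_IMAGE_MODEL") or DEFAULT_MODEL,
    )
```

Passing a plain dict as `env` keeps the loader easy to unit test without touching the real environment.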
Backend uses the official openai SDK's client.images.generate() call, which works with any service exposing the OpenAI image API.
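For illustration, the backend call could look roughly like this. `generate_image` is a hypothetical helper name, and the optional `client` parameter is my own addition so the function can be exercised with a stub. `OpenAI(api_key=..., base_url=...)` and `client.images.generate(...)` are the SDK calls this proposal relies on.

```python
from typing import Any, Optional


def generate_image(
    prompt: str,
    *,
    api_key: str,
    base_url: str = "https://api.openai.com/v1",
    model: str = "gpt-image-1",
    size: str = "1024x1024",
    client: Optional[Any] = None,
) -> Any:
    """Call the OpenAI-style image generation endpoint and return the raw response."""
    if client is None:
        # Imported lazily so the openai package is only required
        # when this provider is actually used.
        from openai import OpenAI

        client = OpenAI(api_key=api_key, base_url=base_url)
    # One image per call; edit/variations are out of scope for this PR.
    return client.images.generate(model=model, prompt=prompt, size=size, n=1)
```

Pointing `base_url` at a gateway or self-hosted proxy is the only change needed to target a different compatible backend.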
Registry strategy
Reuse name="image_generate" and dispatch by env var precedence: FAL_KEY > OPENAI_IMAGE_API_KEY. This avoids any user-facing tool name change and keeps the existing agent prompt untouched.
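The precedence rule can be sketched as a small selector. The function name and the returned provider labels ("fal", "openai_compatible") are hypothetical. Returning None when neither key is set is an assumption about how the tool would report itself as unavailable.

```python
import os
from typing import Mapping, Optional


def select_image_provider(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Pick the image_generate backend by env var precedence: FAL_KEY first."""
    env = os.environ if env is None else env
    if env.get("FAL_KEY"):
        return "fal"  # existing FAL.ai backend wins when both keys are set
    if env.get("OPENAI_IMAGE_API_KEY"):
        return "openai_compatible"
    return None  # assumption: no key means no image backend is available
```

With this ordering, existing FAL users see no behavior change even if they later add OPENAI_IMAGE_API_KEY.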
Scope (this PR)
- generate endpoint only (no edit/variations in this PR)
- Parameter mapping:
aspect_ratio → OpenAI size preset (landscape→1536×1024, square→1024×1024, portrait→1024×1536)
- Output format matches existing FAL tool:
{"success": bool, "image": url}
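A minimal sketch of the parameter mapping and the result shape. The helper names are hypothetical. Falling back to square for an unknown aspect_ratio and returning `"image": None` on failure are my assumptions, not existing behavior. Since base64 return is out of scope, a response without a URL is treated as a failure here.

```python
from typing import Any, Dict, Optional

# aspect_ratio → OpenAI size preset, as listed above.
ASPECT_RATIO_TO_SIZE = {
    "landscape": "1536x1024",
    "square": "1024x1024",
    "portrait": "1024x1536",
}


def size_for_aspect_ratio(aspect_ratio: Optional[str]) -> str:
    """Map a Hermes aspect_ratio to a size string (assumed fallback: square)."""
    key = (aspect_ratio or "square").strip().lower()
    return ASPECT_RATIO_TO_SIZE.get(key, ASPECT_RATIO_TO_SIZE["square"])


def build_result(response: Any) -> Dict[str, Any]:
    """Convert an images.generate() response into the FAL-style result dict."""
    data = getattr(response, "data", None) or []
    url = getattr(data[0], "url", None) if data else None
    if url:
        return {"success": True, "image": url}
    # No URL in the response (e.g. the backend returned only base64 data).
    return {"success": False, "image": None}
```

Keeping the mapping in a single dict makes it easy to adjust if a given gateway only accepts a subset of sizes.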
Out of scope
image edit, variations, masking, base64 return, auto-upscale, safety/moderation knobs.
Motivation
- Community demand for bring-your-own-endpoint image generation, especially in regions where direct OpenAI access is restricted
- Parallels how many Hermes LLM flows already accept an OpenAI-compatible base_url
- Complements the recent GPT Image 2.0 integration — extends the pattern to any compatible backend
Happy to submit a PR if this direction is welcome. Would appreciate a quick thumbs-up or any architectural preference (e.g., abstraction layer requirements) before I start coding.