feat: add reference_image_path support to image_generate tool

Feature Request: Reference Image Support in image_generate Tool
Summary
The image_generate tool currently exposes only prompt and aspect_ratio. Models like GPT Image 2 (via the openai-codex backend) support reference image inputs for style transfer, subject likeness, and composition guidance, but there is no way to pass an image through the tool schema.
Current Behavior
image_generate(prompt, aspect_ratio)
The tool generates images from text only. When a user sends a reference image in the conversation and asks for a generation based on it, the agent can only describe the image in the prompt. The actual pixels are never passed to the image generation API.
Proposed Behavior
image_generate(prompt, aspect_ratio, reference_image_path?)
When reference_image_path is provided (an absolute path to a local file), the backend includes the image as a multimodal content part in the API call. For the openai-codex provider, this means adding an input_image part to the Codex Responses API input alongside the text prompt.
Backends that do not support reference images should silently ignore the parameter, so their existing behavior is preserved.
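A minimal sketch of the content-part construction described above. The function name build_input_content is hypothetical, and the part shapes (input_text, and input_image with a base64 data URL in image_url) follow the public OpenAI Responses API; whether the Codex Responses backend accepts exactly the same shape is an assumption.

```python
import base64
import mimetypes
from pathlib import Path
from typing import Any, Optional


def build_input_content(
    prompt: str, reference_image_path: Optional[str] = None
) -> list[dict[str, Any]]:
    """Return the content parts for one user message (hypothetical helper)."""
    # The text prompt is always present.
    parts: list[dict[str, Any]] = [{"type": "input_text", "text": prompt}]

    if reference_image_path:
        path = Path(reference_image_path)
        # Guess the MIME type from the file extension; falling back to PNG
        # is an assumption made for this sketch.
        mime, _ = mimetypes.guess_type(path.name)
        encoded = base64.b64encode(path.read_bytes()).decode("ascii")
        parts.append(
            {
                "type": "input_image",
                "image_url": f"data:{mime or 'image/png'};base64,{encoded}",
            }
        )
    return parts
```

Without reference_image_path the function returns only the text part, so a text-only call produces the same payload as today.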
Motivation
GPT 5.5 users on the openai-codex provider already have native image generation through the Codex Responses API. The missing piece is passing reference images for:

Subject likeness preservation (e.g., generating variations of a person)
Style transfer (matching a visual style from an example)
Composition guidance (using a layout reference)

This is especially relevant for users generating character-consistent images across sessions, where trait-based prompting alone produces inconsistent results.
Implementation Scope
Three files are involved; two of them need changes:

tools/image_generation_tool.py

Add reference_image_path (optional string) to IMAGE_GENERATE_SCHEMA
Pass it through _handle_image_generate and _dispatch_to_plugin_provider


plugins/image_gen/openai-codex/__init__.py

Accept reference_image_path via **kwargs in generate()
In _collect_image_b64(), build multimodal input content with an input_image part when a reference is provided
Base64-encode the local file and include it as a data URL


agent/image_gen_provider.py (no change needed)

The generate() ABC already accepts **kwargs, so reference_image_path passes through without a signature change
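The pass-through described in this section could look roughly like the sketch below. Everything in it is illustrative: the schema layout, the "1:1" default, the required list, and the class and function names (TextOnlyProvider, ReferenceAwareProvider, handle_image_generate) are assumptions and do not reflect the actual Hermes Agent code. It only shows how an optional schema field can reach generate() through **kwargs while providers without reference support ignore it.

```python
from abc import ABC, abstractmethod
from typing import Any

# Hypothetical schema shape; only the new optional property matters here.
IMAGE_GENERATE_SCHEMA = {
    "name": "image_generate",
    "parameters": {
        "type": "object",
        "properties": {
            "prompt": {"type": "string"},
            "aspect_ratio": {"type": "string"},
            "reference_image_path": {
                "type": "string",
                "description": "Absolute path to a local reference image (optional).",
            },
        },
        "required": ["prompt"],  # assumption: only the prompt is mandatory
    },
}


class ImageGenProvider(ABC):
    @abstractmethod
    def generate(self, prompt: str, aspect_ratio: str = "1:1", **kwargs: Any) -> dict:
        ...


class TextOnlyProvider(ImageGenProvider):
    def generate(self, prompt: str, aspect_ratio: str = "1:1", **kwargs: Any) -> dict:
        # Unknown keyword arguments are ignored, so behavior is unchanged.
        return {"prompt": prompt, "reference_used": False}


class ReferenceAwareProvider(ImageGenProvider):
    def generate(self, prompt: str, aspect_ratio: str = "1:1", **kwargs: Any) -> dict:
        # Pick the new parameter out of kwargs when the caller supplied it.
        ref = kwargs.get("reference_image_path")
        return {"prompt": prompt, "reference_used": ref is not None}


def handle_image_generate(args: dict, provider: ImageGenProvider) -> dict:
    # Forward the reference path only when the tool call included one.
    extra: dict[str, Any] = {}
    if args.get("reference_image_path"):
        extra["reference_image_path"] = args["reference_image_path"]
    return provider.generate(args["prompt"], args.get("aspect_ratio", "1:1"), **extra)
```

Because the parameter travels as a keyword argument, the abstract generate() signature stays untouched and each provider opts in on its own schedule.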



Notes

The openai plugin (API key variant) could also benefit from this via the images.edit endpoint, but that is a separate implementation path.
FAL models that support image-to-image workflows (e.g., img2img) could also consume this parameter in the future.
The parameter name reference_image_path was chosen over image or input_image to be explicit about its role (reference/guidance, not inpainting or editing).

Environment

Hermes Agent (latest main)
Provider: openai-codex
Model: gpt-5.5 (LLM) + gpt-image-2 (image gen)
Gateway: Telegram + WhatsApp

feat: add reference_image_path support to image_generate tool #25661
