fix(image-gen): forward reference images to providers #18805
Closed
Kavyrocom wants to merge 1 commit into
Expose reference_images on the image_generate tool schema and pass them through plugin dispatch so multimodal image providers can receive input images. Add OpenAI Codex image provider support for remote, data URL, and validated local image references. Local files must be absolute paths, actual supported images, and within bounded size/count limits. Add regression coverage for dispatch forwarding, remote references, local data URL conversion, and invalid local references.
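The reference handling described above (remote URLs and data URLs passed through, local files validated and then converted) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual code: the function names, the byte and count limits, the magic-byte check, and the use of `ValueError` as a stand-in for the `invalid_argument` error are all hypothetical.

```python
import base64
import os
from typing import List, Optional

# Hypothetical limits; the PR states that size and count are bounded
# but the actual values are not given in this description.
MAX_IMAGE_BYTES = 10 * 1024 * 1024
MAX_REFERENCES = 4


def sniff_image_mime(header: bytes) -> Optional[str]:
    """Guess the MIME type of a supported image from its leading bytes."""
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if header.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "image/webp"
    if header[:6] in (b"GIF87a", b"GIF89a"):
        return "image/gif"
    return None


def resolve_reference(ref: str) -> str:
    """Turn one reference into a URL a provider could accept."""
    # Remote and data URLs are passed through unchanged.
    if ref.startswith(("http://", "https://", "data:image/")):
        return ref
    # Anything else is treated as a local file and must be an absolute path.
    if not os.path.isabs(ref):
        raise ValueError("invalid_argument: local reference must be an absolute path")
    if not os.path.isfile(ref):
        raise ValueError("invalid_argument: local reference is not a regular file")
    if os.path.getsize(ref) > MAX_IMAGE_BYTES:
        raise ValueError("invalid_argument: local reference exceeds the size limit")
    with open(ref, "rb") as fh:
        data = fh.read(MAX_IMAGE_BYTES + 1)
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError("invalid_argument: local reference exceeds the size limit")
    # Check the content itself rather than trusting the file extension.
    mime = sniff_image_mime(data[:12])
    if mime is None:
        raise ValueError("invalid_argument: local reference is not a supported image")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def resolve_references(refs: List[str]) -> List[str]:
    """Resolve a bounded list of references."""
    if len(refs) > MAX_REFERENCES:
        raise ValueError("invalid_argument: too many reference images")
    return [resolve_reference(ref) for ref in refs]
```

Checking the leading bytes rather than the extension is one way to satisfy the "actual supported images" requirement; the real provider may validate differently.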
Collaborator
Related to #15308: both address reference image forwarding to plugin providers. This PR appears to be the bugfix completion of that feature.
This was referenced May 7, 2026
This was referenced May 11, 2026
Summary
Fixes `image_generate` reference image routing for plugin-backed image providers, with concrete support for the OpenAI Codex / GPT Image 2 provider.
Before this change, the tool schema could expose image references, but plugin dispatch did not reliably pass those references into the configured image provider. In practice, providers such as OpenAI Codex / GPT Image 2 received only the text prompt and aspect ratio, so multimodal image generation/editing requests could not use attached/reference images.
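This patch:
- Adds `reference_images` to the `image_generate` tool schema.
- Normalizes `reference_images` in the tool handler.
- Forwards `reference_images` through plugin provider dispatch.
- Encodes validated local images as `data:image/...;base64,...` image URLs.
- Restricts local files to supported image types (`png`, `jpeg`, `webp`, `gif`).
- Returns `invalid_argument` errors for invalid inputs.
Problem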
Users can configure image generation through different backends, including FAL-backed models and OpenAI/Codex-backed GPT Image 2. Text-only image generation worked, but image input/reference workflows were incomplete for the plugin path: references could be present at the tool boundary but not delivered to the provider.
That meant requests such as “generate an image using these attached logos/references” degraded into prompt-only generation, causing models to approximate or hallucinate logos instead of using the provided image inputs.
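The gap described above, and the shape of the fix, can be illustrated with a small sketch. Everything here is hypothetical (the `EchoProvider` class, the helper names, and the argument layout are assumptions for illustration); it only shows the difference between a dispatch that drops references and one that normalizes and forwards them as a keyword argument.

```python
from typing import Any, Dict, List


class EchoProvider:
    """Hypothetical provider that simply reports what it received."""

    def generate(self, prompt: str, aspect_ratio: str = "1:1", **kwargs: Any) -> Dict[str, Any]:
        return {
            "prompt": prompt,
            "aspect_ratio": aspect_ratio,
            "reference_images": kwargs.get("reference_images", []),
        }


def normalize_reference_images(value: Any) -> List[str]:
    """Accept None, a single string, or a list; return non-empty strings only."""
    if value is None:
        return []
    if isinstance(value, str):
        value = [value]
    return [item.strip() for item in value if isinstance(item, str) and item.strip()]


def dispatch_prompt_only(provider: EchoProvider, args: Dict[str, Any]) -> Dict[str, Any]:
    # Behavior described as the problem: references never reach the provider.
    return provider.generate(args["prompt"], aspect_ratio=args.get("aspect_ratio", "1:1"))


def dispatch_with_references(provider: EchoProvider, args: Dict[str, Any]) -> Dict[str, Any]:
    # Behavior described as the fix: normalize, then forward as a keyword argument.
    refs = normalize_reference_images(args.get("reference_images"))
    extra = {"reference_images": refs} if refs else {}
    return provider.generate(
        args["prompt"], aspect_ratio=args.get("aspect_ratio", "1:1"), **extra
    )


if __name__ == "__main__":
    call = {"prompt": "poster with our logo", "reference_images": ["https://example.com/logo.png"]}
    print(dispatch_prompt_only(EchoProvider(), call)["reference_images"])
    print(dispatch_with_references(EchoProvider(), call)["reference_images"])
```

Because the references travel as an optional keyword argument, a provider whose `generate()` accepts `**kwargs` can ignore them without breaking.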
Implementation notes
Tool layer
`tools/image_generation_tool.py` now exposes and normalizes `reference_images`, then forwards them to the selected plugin provider.
The base `ImageGenProvider.generate()` contract already accepts `**kwargs`, so this remains forward-compatible for providers that ignore unknown keys.
OpenAI Codex / GPT Image 2 provider
`plugins/image_gen/openai-codex/__init__.py` builds multimodal Responses input content:
```python
[
    {"type": "input_text", "text": prompt},
    {"type": "input_image", "image_url": image_url},
]
```
Local images are read only after validation and encoded as `data:image/...;base64,...`.
Validation
Targeted test run:
```
./venv/bin/python -m pytest \
  tests/plugins/image_gen/test_openai_codex_provider.py \
  tests/plugins/image_gen/test_openai_provider.py \
  tests/tools/test_image_generation_plugin_dispatch.py \
  tests/hermes_cli/test_image_gen_picker.py \
  -q -o 'addopts='
```
Result:
Additional checks:
Static scan of added lines found no hardcoded credentials, shell execution, `eval`/`exec`, pickle deserialization, or SQL formatting patterns.
Privacy / safety
This PR intentionally avoids including any local deployment paths, credentials, generated user assets, or installation-specific configuration.
Local reference image handling is constrained to reduce accidental file exfiltration: