feat(image_gen): multi-reference input support, starting with openai-codex #15308
Open
dejay2 wants to merge 1 commit into
…codex
Adds an opt-in capability contract so providers can accept reference images.
The agent-facing `image_generate` tool schema grows an optional `references`
field (list of local file paths); the dispatcher forwards it only when the
active provider advertises `supports_references=True`, otherwise rejects
early with `references_unsupported` instead of silently dropping it.
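
The gating rule above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class names, the `dispatch` helper, and the shape of the returned dicts are assumptions; only `supports_references`, the `references` kwarg, and the `references_unsupported` error code come from the change itself.

```python
from typing import Any, Optional, Sequence


class ImageGenProvider:
    """Hypothetical stand-in for the base class in agent/image_gen_provider.py."""

    name = "base"

    @property
    def supports_references(self) -> bool:
        # Providers are treated as prompt-only unless they explicitly opt in.
        return False

    def generate(self, prompt: str, **kwargs: Any) -> dict:
        raise NotImplementedError


class PromptOnlyProvider(ImageGenProvider):
    """Inherits supports_references=False and never sees a references kwarg."""

    name = "prompt-only"

    def generate(self, prompt: str, **kwargs: Any) -> dict:
        return {"success": True, "provider": self.name}


class ReferenceCapableProvider(ImageGenProvider):
    """Opts in by overriding the capability property."""

    name = "reference-capable"

    @property
    def supports_references(self) -> bool:
        return True

    def generate(self, prompt: str, references: Optional[Sequence[str]] = None, **kwargs: Any) -> dict:
        return {"success": True, "provider": self.name, "references": len(references or [])}


def dispatch(provider: ImageGenProvider, prompt: str, references: Optional[Sequence[str]] = None) -> dict:
    """Forward references only to capable providers; reject early otherwise."""
    if references:
        if not provider.supports_references:
            # Fail loudly instead of silently dropping the caller's images.
            return {"success": False, "error": "references_unsupported", "provider": provider.name}
        return provider.generate(prompt, references=list(references))
    # No references supplied: the call is identical to the pre-change path.
    return provider.generate(prompt)
```

With this shape, a prompt-only provider is never handed a kwarg it does not understand, and a caller that supplies references always learns whether they were used.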
The openai-codex provider opts in and honours references by attaching each
file as an `input_image` content item on the user message, labelled in the
prompt ("Reference image 1 is provided below.") per OpenAI's best-practice
guidance for gpt-image-2 composition. Up to MAX_REFERENCES (16) are sent,
matching the documented upstream limit.
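
A minimal sketch of how such a user message could be assembled, assuming the Responses-style content items `input_text` and `input_image` with a base64 data URL. The function name, its signature, the placement of each label directly before its image, and the choice to truncate rather than reject when more than 16 references arrive are all assumptions made for illustration; the plugin's actual `_build_user_content` may differ.

```python
import base64

MAX_REFERENCES = 16  # cap stated in this PR


def build_user_content(prompt: str, images: list[tuple[bytes, str]]) -> list[dict]:
    """Build content items for one user message.

    `images` holds (raw_bytes, mime_type) pairs that were already read from disk.
    """
    content = [{"type": "input_text", "text": prompt}]
    # Assumption: extra references beyond the cap are dropped, not rejected.
    for index, (data, mime) in enumerate(images[:MAX_REFERENCES], start=1):
        # Label each image so the prompt can refer to it by number.
        content.append({"type": "input_text", "text": f"Reference image {index} is provided below."})
        encoded = base64.b64encode(data).decode("ascii")
        content.append({"type": "input_image", "image_url": f"data:{mime};base64,{encoded}"})
    return content
```

Numbering the images in the text lets a prompt say, for example, "use the style of reference image 1 and the subject of reference image 2".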
Empirically verified against the live Codex backend: two references combine
correctly (subject + subject → merged scene), and a style-from-one-reference
+ subject-from-another composition preserves both inputs. This contradicts
the assumption baked into the existing baoyu-comic skill (which documents
`image_generate` as prompt-only) — that guidance is now accurate only when
the configured provider doesn't support references.
- agent/image_gen_provider.py: add `supports_references` property (default
False) and document the `references` kwarg contract.
- plugins/image_gen/openai-codex/__init__.py: set `supports_references=True`,
add `_load_reference_images` + `_build_user_content`, wire the references
kwarg through `generate()` with invalid-argument / invalid-reference error
handling.
- tools/image_generation_tool.py: expose `references` in the tool schema,
forward via the dispatcher, return `references_unsupported` for FAL and
any plugin provider that doesn't opt in.
- tests: new `TestReferences` class (7 cases) in the openai-codex tests,
new `TestReferencesDispatch` class (4 cases) in the dispatcher tests,
and the schema invariant test updated to match the new contract. Full
image_gen suite stays green (133 passed).
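
For orientation, the reference-loading step with its two error paths might look roughly like the sketch below. The exception class, the function signature, and the mapping of failures to codes (malformed argument → `invalid_argument`, missing or non-image file → `invalid_reference`) are assumptions; only the two error code names come from this change.

```python
import mimetypes
from pathlib import Path


class ImageGenError(Exception):
    """Hypothetical error type carrying a machine-readable code."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code


def load_reference_images(references) -> list:
    """Read each path and return (raw_bytes, mime_type) pairs."""
    # The argument itself must be a list (or tuple) of path strings.
    if not isinstance(references, (list, tuple)) or not all(isinstance(r, str) for r in references):
        raise ImageGenError("invalid_argument", "references must be a list of file paths")
    loaded = []
    for raw in references:
        path = Path(raw).expanduser()
        # Guess the MIME type from the file extension only (no content sniffing).
        mime, _ = mimetypes.guess_type(path.name)
        if not path.is_file() or mime is None or not mime.startswith("image/"):
            raise ImageGenError("invalid_reference", f"not a readable image file: {raw}")
        loaded.append((path.read_bytes(), mime))
    return loaded
```

Validating every path before any network call means a bad reference fails fast, without spending a generation request.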
No breaking changes: existing providers (fal, openai, xai) unaffected —
they inherit `supports_references=False` and the dispatcher shields them
from reference payloads.
This was referenced May 11, 2026
This was referenced May 28, 2026
Summary
Adds an opt-in capability contract so `image_gen` providers can accept reference images, and wires the `openai-codex` plugin up as the first provider to opt in. The `image_generate` tool gains an optional `references` field (array of local file paths); the dispatcher forwards it only when the active provider advertises `supports_references=True`, otherwise rejects early with `references_unsupported` instead of silently dropping it.
Motivation: the Codex `image_generation` tool accepts `input_image` after all
The existing `baoyu-comic` skill documents `image_generate` as "prompt-only" and states that it "does NOT accept reference images", and #14317 explicitly scoped the openai-codex plugin to the `generate` endpoint. This PR's finding contradicts both: passing one or more `input_image` content items on the user message of the Codex `responses.stream(...)` call works, and the `image_generation` tool uses the attached images for composition, style transfer, and edits.
Empirically verified against the live Codex backend with three tests:
(Happy to share the PNGs in a follow-up comment if useful — they're local and not on a public host yet.)
Scope
In:
- `agent/image_gen_provider.py`: `supports_references` property (defaults `False`) with doc for the `references` kwarg.
- `plugins/image_gen/openai-codex/__init__.py`: opts in; adds `_load_reference_images` + `_build_user_content`; wires references through `generate()` with `invalid_argument` / `invalid_reference` error paths. Caps at `MAX_REFERENCES = 16` (matches OpenAI's documented upstream cap).
- `tools/image_generation_tool.py`: `references` in the agent-facing schema; dispatcher forwards only to capable providers; clear `references_unsupported` error for FAL and any non-capable plugin.
- `TestReferences` in the openai-codex suite (7 new cases), `TestReferencesDispatch` in the dispatcher suite (4 new cases), schema invariant test updated.
Out (deliberately, happy to follow up in a separate PR):
- […] `openai` (API-key) plugin in to references via the Images Edit endpoint.
Non-breaking
- `fal`, `openai`, `xai` all inherit `supports_references=False`; their behaviour is byte-identical to before.
- […] `generate()`, so no provider is silently passed kwargs it doesn't understand.
- […] `image_generate` tool with no `references` field continues to behave exactly as before.
Test plan
- `pytest tests/plugins/image_gen/ tests/tools/test_image_generation.py tests/tools/test_image_generation_env.py tests/tools/test_image_generation_plugin_dispatch.py` → 133 passed.
- […] `tools.image_generation_tool._handle_image_generate` (same path a real tool call takes): `{"prompt": "...", "references": ["/path/a.png"]}` with `image_gen.provider: fal` → returns `references_unsupported`.
- […] `_handle_image_generate` with `image_gen.provider: openai-codex`: single-reference re-lighting prompt → returns `success=True`, `provider=openai-codex`, `references=1`, `image` points to a 1.9 MB PNG that preserves the reference subject. Full round-trip: 30s.
Open questions
- […] `references` since it's what OpenAI's docs call them; other options include `input_images` or `reference_images`.
- […] `image_edit` tool instead of growing `image_generate`, I'm happy to refactor.
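
To make the naming question concrete, here is a hedged sketch of what the parameter schema might look like and how small a rename would be. The dict layout follows the usual JSON Schema shape for function tools; the descriptions, the `IMAGE_GENERATE_PARAMETERS` name, and the `with_renamed_field` helper are hypothetical. Only the optional `references` array of local file paths is taken from this PR.

```python
import copy

# Hypothetical parameter schema for the image_generate tool.
IMAGE_GENERATE_PARAMETERS = {
    "type": "object",
    "properties": {
        "prompt": {"type": "string", "description": "Text description of the image."},
        "references": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Optional local file paths of reference images.",
        },
    },
    # `references` is deliberately absent here, so existing calls stay valid.
    "required": ["prompt"],
}


def with_renamed_field(schema: dict, old: str, new: str) -> dict:
    """Return a copy of the schema with one property renamed."""
    renamed = copy.deepcopy(schema)
    renamed["properties"][new] = renamed["properties"].pop(old)
    renamed["required"] = [new if name == old else name for name in renamed["required"]]
    return renamed
```

Under these assumptions, switching to `input_images` or `reference_images` is a one-key change in the schema plus the matching kwarg name in the dispatcher.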