Codex CLI Image Processing #1619

@nmurrell07

What version of Codex is running?

0.7.0

Which model were you using?

o3

What platform is your computer?

Linux 6.11.0-29 generic x86_64 x86_64

What steps can reproduce the bug?

https://help.openai.com/en/articles/11096431-openai-codex-cli-getting-started lists the following as a Codex CLI feature:

Multimodal inputs – pass text, screenshots, or diagrams and let the agent generate or edit code accordingly.

As a user, I am unable to use Codex CLI to view and review images. Codex CLI is aware of its own ability to review images, but it directs me to provide screenshots through the web UI version of Codex instead. I would like Codex CLI to be able to capture a screenshot from within the Godot application in headless mode, review the screenshot, and then iterate on the code based on that review. This loop is currently possible with Claude Code and Gemini CLI. I have tried various MCP servers with image-related capabilities, but to no effect.

Could you please either update Codex CLI's image review capabilities so that this works, or update the documentation? Alternatively, could you provide instructions or a few ideas I can follow up on to get this working?

What is the expected behavior?

Codex CLI captures and reviews screenshots from within the Godot project, then iterates on the code and reviews a new screenshot to verify the results.

What do you see instead?

Images are converted to text: quadrants are summarized by color saturation in order to determine object placement on screen.
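
For illustration only, here is a rough sketch of the kind of quadrant summary described above. It is a reconstruction under assumptions, not code taken from Codex CLI: the function name `summarize_quadrants` is hypothetical, and the image is assumed to be already decoded into rows of `(r, g, b)` tuples with values from 0 to 255 and a size of at least 2x2.

```python
import colorsys


def summarize_quadrants(pixels):
    """Return the mean HSV saturation (0.0 to 1.0) of each image quadrant.

    `pixels` is a hypothetical input format: a list of rows, each row a
    list of (r, g, b) tuples with channel values in 0..255.
    """
    height = len(pixels)
    width = len(pixels[0])
    mid_y, mid_x = height // 2, width // 2

    # Each region is (row_start, row_end, col_start, col_end), end-exclusive.
    regions = {
        "top_left": (0, mid_y, 0, mid_x),
        "top_right": (0, mid_y, mid_x, width),
        "bottom_left": (mid_y, height, 0, mid_x),
        "bottom_right": (mid_y, height, mid_x, width),
    }

    summary = {}
    for name, (y0, y1, x0, x1) in regions.items():
        # colorsys expects channels scaled to 0..1; index 1 is saturation.
        saturations = [
            colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1]
            for row in pixels[y0:y1]
            for (r, g, b) in row[x0:x1]
        ]
        summary[name] = sum(saturations) / len(saturations)
    return summary
```

A summary like this only says which quadrant is more colorful. It cannot say what is drawn there or whether a UI element is laid out correctly, which is why it is not a substitute for the model actually viewing the screenshot.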

Additional information

No response

Labels

bug (Something isn't working)