
feat: image content#3887

Merged
badlogic merged 21 commits into
earendil-works:mainfrom
cristinaponcela:feat/image-outputs
May 8, 2026

@cristinaponcela cristinaponcela commented Apr 28, 2026

Contributor

This PR adds a new API, closely mirroring the stream API, to support image blocks and image models (via Google/OpenRouter) so the agent can output images.

With a simple (clanked) extension, you can test in the TUI:

[Image: Screenshot 2026-05-04 at 17 39 07]


badlogic commented May 4, 2026

Collaborator

closing this out, as we have a new plan.

@badlogic badlogic closed this May 4, 2026
@badlogic badlogic reopened this May 4, 2026

badlogic commented May 4, 2026

Collaborator

wait, shit, i thought that was the old PR :D

@badlogic badlogic left a comment

Collaborator

overall squeaky clean! great job. left some minor comments.

@@ -0,0 +1,202 @@
import OpenAI from "openai";

Collaborator

hmmm, i wonder if we should move imagegen provider impls to their own subfolder, to separate them from the normal providers. how do you feel about that @cristinaponcela ?

Contributor Author

Yep, cleaner. ✅ 60e885f

Comment thread packages/ai/src/images.ts Outdated
return provider.images(model, context, options as ImagesOptions);
}

export async function completeImages<TApi extends ImagesApi>(

Collaborator

rename to generateImages

Contributor Author

0d96b9b

@@ -0,0 +1,111 @@
import { beforeEach, describe, expect, it, vi } from "vitest";

Collaborator

we'll need an end-to-end test as well. see stream.test.ts as a sort of template.

  • list of functions all provider specific tests can call into
  • one describe per provider with "sub tests" calling into the functions with a known-good model from that provider
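
A rough sketch of that layout, kept independent of the test runner so it stands alone. The block and function types, the stub provider, and the model name are all hypothetical placeholders, not the actual code in `packages/ai`. In a vitest suite, each provider entry would presumably become its own `describe()` and each shared check its own `it()`:

```typescript
// Hypothetical result shape; the real types in packages/ai may differ.
type OutputBlock =
  | { type: "image"; mimeType: string; data: string }
  | { type: "text"; text: string };
type GenerateFn = (model: string, prompt: string) => Promise<{ output: OutputBlock[] }>;

// Shared checks that every provider-specific suite can call into.
async function checkReturnsImage(generate: GenerateFn, model: string): Promise<void> {
  const result = await generate(model, "Draw a red circle");
  const images = result.output.filter((block) => block.type === "image");
  if (images.length === 0) {
    throw new Error(`${model}: expected at least one image block`);
  }
}

async function checkImageHasData(generate: GenerateFn, model: string): Promise<void> {
  const result = await generate(model, "Draw a red circle");
  for (const block of result.output) {
    if (block.type === "image" && block.data.length === 0) {
      throw new Error(`${model}: image block has no data`);
    }
  }
}

// One entry per provider, each paired with a known-good model.
// The stub below stands in for a real provider call (assumption).
const providers: Array<{ name: string; model: string; generate: GenerateFn }> = [
  {
    name: "stub-provider",
    model: "stub-image-model",
    generate: async () => ({
      output: [{ type: "image", mimeType: "image/png", data: "aGVsbG8=" }],
    }),
  },
];

// Runs every shared check against every provider and returns the names that passed.
async function runAll(): Promise<string[]> {
  const passed: string[] = [];
  for (const provider of providers) {
    await checkReturnsImage(provider.generate, provider.model);
    await checkImageHasData(provider.generate, provider.model);
    passed.push(provider.name);
  }
  return passed;
}
```

Adding a second provider would then only mean adding another table entry with its own known-good model; the checks themselves stay shared.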

Contributor Author

Makes sense. Done, only 1 describe bc 1 provider (OpenRouter), using Gemini 2.5 flash image preview.

3728e4b

Comment thread packages/ai/README.md
@@ -1249,10 +1249,11 @@ Create a new provider file (for example `amazon-bedrock.ts`) that exports:
- Add credential detection in `env-api-keys.ts` for the new provider

Collaborator

ai/README.md needs a dedicated section on image generation explaining all the ins and outs and don'ts, following the style of the rest of the README.md.

Contributor Author

074747c

Do lmk if this needs more detail or is too verbose, first time writing public docs :3

@cristinaponcela cristinaponcela requested a review from badlogic May 5, 2026 08:08
@badlogic badlogic added the inprogress Issue is being worked on label May 7, 2026

badlogic commented May 7, 2026

Collaborator

@cristinaponcela I think we should remove images() entirely here.

The OpenRouter provider sends stream: false, so images() is not actually streaming anything. It waits for the full HTTP response and then emits synthetic events after the fact. That makes the API misleading, especially because generateImages() can return both image and text blocks, but the current events only expose image blocks.

Let's keep this as a one-shot API only:

  • generateImages(model, context, options) returns the final result
  • result output can contain image blocks and text blocks
  • no AssistantImagesEventStream
  • no images() helper
  • no image event protocol

Abort support is still needed. The current provider already passes options.signal into the OpenAI request options, so please keep that behavior when moving generateImages() to call the provider directly.

Please also update packages/ai/README.md accordingly.
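
To make the proposed shape concrete, here is a minimal sketch of a one-shot `generateImages()` that calls a provider directly and forwards `options.signal`. Everything except the `generateImages` name and the `signal` option is an assumption for illustration: the type names, the `generate` method, and the stand-in provider are hypothetical and do not reflect the actual code in `packages/ai`.

```typescript
// Hypothetical shapes for illustration; the real types in packages/ai may differ.
type ImageBlock = { type: "image"; mimeType: string; data: string };
type TextBlock = { type: "text"; text: string };
type ImagesResult = { output: Array<ImageBlock | TextBlock> };
type ImagesOptions = { signal?: AbortSignal };
type ImagesContext = { prompt: string };

interface ImagesProvider {
  generate(model: string, context: ImagesContext, options?: ImagesOptions): Promise<ImagesResult>;
}

// Stand-in provider (assumption): resolves after a short delay,
// and rejects if the signal is aborted before the delay elapses.
const fakeProvider: ImagesProvider = {
  generate(model, context, options) {
    return new Promise<ImagesResult>((resolve, reject) => {
      const signal = options?.signal;
      if (signal?.aborted) {
        reject(new Error("aborted"));
        return;
      }
      const timer = setTimeout(() => {
        signal?.removeEventListener("abort", onAbort);
        resolve({
          output: [
            { type: "text", text: `Generated by ${model}: ${context.prompt}` },
            { type: "image", mimeType: "image/png", data: "aGVsbG8=" },
          ],
        });
      }, 10);
      function onAbort(): void {
        clearTimeout(timer);
        reject(new Error("aborted"));
      }
      signal?.addEventListener("abort", onAbort, { once: true });
    });
  },
};

// One-shot entry point: no event stream, only the final result,
// which may mix image blocks and text blocks.
async function generateImages(
  model: string,
  context: ImagesContext,
  options?: ImagesOptions,
): Promise<ImagesResult> {
  // The abort signal is passed through untouched so the provider can cancel the request.
  return fakeProvider.generate(model, context, options);
}
```

A caller would create an `AbortController`, pass `controller.signal` in the options, and call `controller.abort()` to cancel; the returned promise then rejects instead of resolving with a result.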

@badlogic badlogic merged commit 9751057 into earendil-works:main May 8, 2026
1 check passed

badlogic commented May 8, 2026

Collaborator

looking good, cheers!

ziye0180 added a commit to ziye0180/ziye-pi that referenced this pull request May 15, 2026
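Add missing entries:
- packages/ai: image output feature (PR earendil-works#3887), Bun WebSocket proxy fix, Fireworks session affinity fix
- packages/tui: indented wrapping of list items, checkbox rendering, markdown robustness for large files, image placement fix
- packages/coding-agent: TTY error recovery, preserving the .agents source, cross-package sync of the image output feature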

ziye

Co-Authored-By: ziye <ziye0180@outlook.com>
larsboes pushed a commit to larsboes/pi-mono that referenced this pull request May 17, 2026
