
fix(read): persist image data and inject MEDIA directive for channel delivery #11754

Closed
QDenka wants to merge 2 commits into openclaw:main from QDenka:fix/issue-11735

Conversation


@QDenka QDenka commented Feb 8, 2026

Summary

When the read tool reads an image file, the base64 image data is returned as a content block visible to the LLM but is never converted to a deliverable media URL. As a result, images read by agents are not sent to Telegram or other channels; the user sees only the agent's text reply, without the image.

Changes

  • src/agents/pi-tools.read.ts: After reading an image, persist the base64 data to a cache file under .openclaw/media-cache/ in the workspace and inject a MEDIA:./relative-path directive into the text content block. The existing delivery pipeline then picks up the relative path via splitMediaFromOutput and sends the image to the channel.
  • src/agents/pi-tools.ts: Pass workspaceRoot to createOpenClawReadTool for the non-sandboxed path.
  • src/agents/pi-tools.read.image-delivery.test.ts: Tests verifying MEDIA injection for image reads, no injection for text reads, and no injection without workspaceRoot.

How it works

  1. Agent calls read on an image file
  2. Read tool returns { type: 'image', data: '<base64>', mimeType: 'image/png' } content block
  3. New: The image data is persisted to .openclaw/media-cache/<hash>.png
  4. New: A MEDIA:./… directive is appended to the text content block
  5. The delivery pipeline (splitMediaFromOutput) extracts the media URL
  6. Image is sent to Telegram/other channels via sendMedia
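Steps 3 and 4 can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: `persistImage`, `withMediaDirective`, the block types, and the `EXT_BY_MIME` table are hypothetical names, the file is written synchronously for brevity, and the hash covers the full payload rather than a prefix.

```typescript
import { Buffer } from "node:buffer";
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical shapes for illustration; the real types live in the repository.
type ImageBlock = { type: "image"; data: string; mimeType: string };
type TextBlock = { type: "text"; text: string };

// Assumed mapping; the PR's own MIME table may differ.
const EXT_BY_MIME: Record<string, string> = {
  "image/png": "png",
  "image/jpeg": "jpg",
  "image/gif": "gif",
  "image/webp": "webp",
};

// Step 3: decode the base64 payload and write it under
// <workspaceRoot>/.openclaw/media-cache/, returning a workspace-relative path.
function persistImage(workspaceRoot: string, image: ImageBlock): string {
  const ext = EXT_BY_MIME[image.mimeType] ?? "png";
  const hash = createHash("sha256").update(image.data).digest("hex").slice(0, 16);
  const relPath = `.openclaw/media-cache/${hash}.${ext}`;
  mkdirSync(join(workspaceRoot, ".openclaw", "media-cache"), { recursive: true });
  writeFileSync(join(workspaceRoot, relPath), Buffer.from(image.data, "base64"));
  return relPath;
}

// Step 4: append one MEDIA directive, with a single "./" prefix, to a text block.
function withMediaDirective(block: TextBlock, relPath: string): TextBlock {
  return { ...block, text: `${block.text}\nMEDIA:./${relPath}` };
}

// Usage: a temporary directory stands in for the workspace, and the payload
// is a placeholder string rather than a real PNG.
const root = mkdtempSync(join(tmpdir(), "workspace-"));
const image: ImageBlock = { type: "image", data: "aGVsbG8=", mimeType: "image/png" };
const relPath = persistImage(root, image);
const result = withMediaDirective({ type: "text", text: "Read image file" }, relPath);
console.log(existsSync(join(root, relPath)), result.text);
```

Hashing the whole payload avoids filename collisions between images that happen to share their first bytes, at the cost of hashing more data per read.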

Fixes #11735

Greptile Overview

Greptile Summary

This PR updates the read tool wrapper so that when an image is read (base64 image content block), the image payload is persisted into a workspace-local cache directory (.openclaw/media-cache/) and a MEDIA: directive is appended to the tool’s text output. The existing media parsing/delivery pipeline (splitMediaFromOutput → channel sendMedia) can then detect the directive and deliver the image to downstream channels. It also wires workspaceRoot through the non-sandboxed tool creation path and adds a Vitest suite covering the injection behavior.

Confidence Score: 2/5

  • This PR has a few correctness issues that can break media delivery and should be fixed before merging.
  • The core idea (persist + MEDIA directive) fits the existing splitMediaFromOutput pipeline, but the current directive/path formatting and injection behavior are inconsistent with how MEDIA tokens are parsed/consumed, and the persistence step can produce invalid/corrupted files without detection. These are likely to cause real delivery failures or duplicated MEDIA extraction in normal operation.
  • src/agents/pi-tools.read.ts (MEDIA path format, injection target, image persistence validation); src/agents/pi-tools.read.image-delivery.test.ts (would currently fail once path semantics are corrected/validated)
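One way the persistence-validation concern could be addressed is to decode the payload strictly before writing it. The sketch below is only an illustration under that assumption; `decodeBase64Strict` is a hypothetical helper, not a function from the PR. It relies on the fact that Node's `Buffer.from(data, "base64")` does not throw on malformed input, so the check has to be explicit.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical helper: returns the decoded bytes, or undefined when the
// input is not well-formed, canonical base64.
function decodeBase64Strict(data: string): Buffer | undefined {
  // Tolerate embedded whitespace such as line wrapping.
  const compact = data.replace(/\s+/g, "");
  if (compact.length === 0 || compact.length % 4 !== 0) return undefined;
  if (!/^[A-Za-z0-9+/]+={0,2}$/.test(compact)) return undefined;
  const bytes = Buffer.from(compact, "base64");
  // Round-trip check: re-encoding must reproduce the input exactly.
  return bytes.toString("base64") === compact ? bytes : undefined;
}

// Usage: skip persistence (and the MEDIA directive) when decoding fails.
const decoded = decodeBase64Strict("aGVsbG8=");
console.log(decoded?.length);
```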


…delivery

When the read tool reads an image file, the base64 image data is returned
as a content block visible to the LLM but never converted to a deliverable
media URL. This means images read by agents are not sent to Telegram or
other channels.

Fix: after reading an image, persist the base64 data to a cache file under
.openclaw/media-cache/ in the workspace and inject a MEDIA: directive into
the text content block. The delivery pipeline then picks up the relative
path and sends the image to the channel.

Fixes openclaw#11735
@openclaw-barnacle openclaw-barnacle bot added the agents Agent runtime and tooling label Feb 8, 2026

@greptile-apps greptile-apps bot left a comment


1 file reviewed, 3 comments


Comment on lines +304 to +307
): Promise<string | undefined> {
  const ext = MIME_TO_EXT[imageBlock.mimeType] ?? "png";
  const hash = createHash("sha256")
    .update(imageBlock.data.slice(0, 1024))

Broken MEDIA path

persistReadImage returns a path like ./.openclaw/media-cache/..., but later code (and the test) treat the MEDIA: payload as a relative path without the leading ./ (see splitMediaFromOutput which returns ./... and the test’s match(/MEDIA:\.\/(.+)/)). With the current implementation, splitMediaFromOutput will parse MEDIA:././.openclaw/... and the extracted token becomes ././.openclaw/..., which will not map cleanly back to the workspace when consumers do join(workspaceRoot, token).

This breaks delivery for the generated directives unless the receiver happens to normalize ././.
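A possible way to avoid the doubled prefix is to normalize the path once and let a single helper add the `./`. The following is a minimal sketch under that assumption; `toMediaDirective` is a hypothetical name and the PR may resolve this differently.

```typescript
import { posix } from "node:path";

// Hypothetical helper: accepts a workspace-relative path with or without a
// leading "./" and emits a directive with exactly one "./" prefix.
function toMediaDirective(relPath: string): string {
  // Use forward slashes so the directive looks the same on every platform.
  const normalized = posix.normalize(relPath.replace(/\\/g, "/"));
  // Reject anything that would resolve outside the workspace.
  if (posix.isAbsolute(normalized) || normalized === ".." || normalized.startsWith("../")) {
    throw new Error(`path escapes the workspace: ${relPath}`);
  }
  return `MEDIA:./${normalized}`;
}

// Usage: all three spellings produce the same directive.
const variants = [
  ".openclaw/media-cache/a.png",
  "./.openclaw/media-cache/a.png",
  "././.openclaw/media-cache/a.png",
].map(toMediaDirective);
console.log(variants);
```

Keeping the prefix logic in one place means the function that persists the image can return a bare relative path, and the test can assert on a single canonical form.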


Comment on lines +327 to +341
function injectMediaDirective(
  result: AgentToolResult<unknown>,
  mediaPath: string,
): AgentToolResult<unknown> {
  const content = Array.isArray(result.content) ? result.content : [];
  const nextContent = content.map((block) => {
    if (
      block &&
      typeof block === "object" &&
      (block as { type?: unknown }).type === "text" &&
      typeof (block as { text?: unknown }).text === "string"
    ) {
      const b = block as TextContentBlock & { text: string };
      return { ...b, text: `${b.text}\nMEDIA:${mediaPath}` } satisfies TextContentBlock;
    }

MEDIA injected into all text blocks

injectMediaDirective appends MEDIA: to every text block in the tool result. If read returns multiple text blocks (e.g., header + extra notes/errors), this will create multiple MEDIA: tokens; splitMediaFromOutput will then extract duplicates and also strip those lines from the user-visible text.

This should only inject into a single, intended text block (typically the first/header), or ensure only one directive is appended.
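The single-injection behavior suggested here might look like the sketch below. It is an illustration only: `injectMediaDirectiveOnce` and the simplified block types are hypothetical, and appending a new text block when none exists is an assumed design choice rather than something the PR specifies.

```typescript
// Simplified, hypothetical block types for illustration.
type TextBlock = { type: "text"; text: string };
type ImageBlock = { type: "image"; data: string; mimeType: string };
type ContentBlock = TextBlock | ImageBlock;

// Append the directive to the first text block only, and do nothing if the
// same directive is already present anywhere in the result.
function injectMediaDirectiveOnce(content: ContentBlock[], mediaPath: string): ContentBlock[] {
  const directive = `MEDIA:${mediaPath}`;
  const alreadyPresent = content.some(
    (b) => b.type === "text" && b.text.split("\n").includes(directive),
  );
  if (alreadyPresent) return content;

  const target = content.findIndex((b) => b.type === "text");
  if (target === -1) {
    // Assumption: with no text block to extend, add one carrying the directive.
    return [...content, { type: "text", text: directive }];
  }
  return content.map((b, i) =>
    i === target && b.type === "text" ? { ...b, text: `${b.text}\n${directive}` } : b,
  );
}

// Usage: two text blocks around an image yield exactly one directive.
const blocks: ContentBlock[] = [
  { type: "text", text: "header" },
  { type: "image", data: "aGVsbG8=", mimeType: "image/png" },
  { type: "text", text: "notes" },
];
const injected = injectMediaDirectiveOnce(blocks, "./.openclaw/media-cache/abc.png");
console.log(injected);
```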


@openclaw-barnacle

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle bot added stale Marked as stale due to inactivity and removed stale Marked as stale due to inactivity labels Feb 21, 2026
@QDenka QDenka closed this Feb 24, 2026

Labels

agents Agent runtime and tooling


Development

Successfully merging this pull request may close these issues.

[Bug] Agent read tool returns image but not sent to Telegram channel

2 participants