Skip to content

fix(feishu): download images in Feishu thread history and inline attachment hints [AI-assisted]#66432

Open
gee-bjak wants to merge 1 commit intoopenclaw:mainfrom
gee-bjak:fix/feishu-thread-history-image-download
Open

fix(feishu): download images in Feishu thread history and inline attachment hints [AI-assisted]#66432
gee-bjak wants to merge 1 commit intoopenclaw:mainfrom
gee-bjak:fix/feishu-thread-history-image-download

Conversation

@gee-bjak
Copy link
Copy Markdown

Problem

When an image (or other media) message exists in a Feishu thread's history, the agent only receives the placeholder text [image message] — it never sees the actual image data.

Root cause: resolveFeishuMediaList was only called for the current inbound message. Historical thread messages fetched via listFeishuThreadMessages went through parseFeishuMessageContent, which returns [image message] for msgType === "image" and similar placeholders for other media types.

Solution

For each message in thread history that has a media content type (image, file, audio, video, media, sticker, post), call resolveFeishuMediaList to download and save the media locally, then embed the saved path as a [media attached: /path (mime) | /path] hint in the message body.

This matches the format the embedded agent runner already uses for current-message attachments, so no changes to the runner are needed.

Changes

  • extensions/feishu/src/send.ts — Added optional rawContent field to FeishuThreadMessageInfo and populate it from item.body?.content so the raw Feishu API JSON is available for media key extraction.
  • extensions/feishu/src/bot.ts — Changed the synchronous historyParts.map() to an async for...of loop that calls resolveFeishuMediaList for media messages. Download errors are caught per-message and logged; history still renders without the attachment if download fails.
  • extensions/feishu/src/bot.test.ts — Added a test that sends a thread history message with contentType: "image" and rawContent, and asserts that downloadMessageResourceFeishu is called and the resulting path appears in ThreadHistoryBody.

Testing

pnpm test:extension feishu
Test Files  61 passed (61)
      Tests  615 passed (615)  ← 614 existing + 1 new

pnpm build passes cleanly.

pnpm check / pnpm tsgo has 3 pre-existing failures in extensions/qa-lab/src/scenario-catalog.test.ts that are already red on main CI (verified by checking out main and running the same check — same errors, no change).

AI Disclosure

  • AI-assisted: Authored by ClawGuard (an OpenClaw agent running in a Feishu thread) using claude-sonnet-4-6.
  • Degree of testing: Lightly tested — pnpm test:extension feishu passes. Not tested against a live Feishu account (no test credentials available in this environment).
  • Human understanding confirmed: The human operator (gee-bjak) reviewed the approach before execution.

…chment hints

Previously, image (and other media) messages in thread history were
silently reduced to a placeholder string like "[image message]".
The agent could see the placeholder text but never received the actual
image data.

Root cause: resolveFeishuMediaList was only called for the current
inbound message, never for historical thread messages fetched via
listFeishuThreadMessages.

Changes:
- Add rawContent (optional) field to FeishuThreadMessageInfo so the
  raw Feishu API body JSON is available after parsing.
- Store rawContent in listFeishuThreadMessages results.
- Replace the synchronous historyParts.map() with an async for-loop
  that calls resolveFeishuMediaList for each media message in thread
  history (image/file/audio/video/media/sticker/post types).
- Embed downloaded media paths as [media attached: /path (mime) | /path]
  hints in the message body, matching the format the agent runner
  already uses for current-message attachments.
- Errors are caught per-message and logged; history still renders
  without the attachment if download fails.
- Add a test covering the image download and ThreadHistoryBody hint.

AI-assisted: authored by ClawGuard (OpenClaw agent) via Feishu thread.
Testing: pnpm test:extension feishu — 615 tests passing (614 existing + 1 new).
@openclaw-barnacle openclaw-barnacle Bot added channel: feishu Channel integration: feishu size: S labels Apr 14, 2026
@wonglingyit-bjak
Copy link
Copy Markdown

Took a look through this — nice fix, the reuse of resolveFeishuMediaList keeps things clean.

A couple of things I'd want addressed before merging:

  • The mediaTypes array at bot.ts:965 gets re-allocated every iteration of the loop. Either hoist it out or just drop the guard entirely since resolveFeishuMediaList already bails early for non-media types internally.
  • There's no test covering what happens when the download fails (the catch at bot.ts:989). Would be good to have one since that's the main resilience guarantee here — one bad download shouldn't tank the whole thread history.

Also looked into whether this should be using media://inbound/<id> claim-check URIs instead of absolute paths for sandbox compatibility. Turns out the existing current-message Feishu media flow already emits absolute paths from saveMediaBuffer via buildAgentMediaPayload — the media:// scheme is a Gateway-side offload concern, not something channel extensions produce today. So this is consistent with how things already work. If we want media:// at the channel level that's a broader conversation across all channels, not something to block this PR on.

@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Apr 27, 2026

Codex review: needs changes before merge.

What this changes:

The PR preserves raw Feishu content on fetched thread-history messages, resolves media-typed history messages through the existing Feishu media downloader, appends inline attachment hints to ThreadHistoryBody, and adds a happy-path Feishu bot test.

Required change before merge:

The remaining blockers are narrow and mechanical enough for an automated repair attempt on the PR branch or a replacement branch: starter-media handling, aggregate history-media caps, focused tests, and changelog.

Security review:

Security review needs attention: Security review found no CI, dependency, secret, or supply-chain change, but the diff introduces a concrete resource-exhaustion concern by resolving every media item in fetched thread history without an aggregate cap.

Review findings:

  • [P2] Resolve media for the thread starter — extensions/feishu/src/bot.ts:956-958
  • [P2] Bound history media resolution — extensions/feishu/src/bot.ts:965
  • [P3] Add the required changelog entry — extensions/feishu/src/bot.ts:956
Review details

Best possible solution:

Land a bounded Feishu-owned fix: preserve rawContent for both fetched root/starter messages and fetched thread-history messages, resolve media only after sender visibility filtering, cap historical media resolution by recent messages/items, keep the existing attachment hint contract, add happy-path, failure-path, starter-media, and cap tests, and include the required changelog entry.

Do we have a high-confidence way to reproduce the issue?

Yes. A high-confidence source-level reproduction is available by mocking a Feishu thread history item with contentType image and parsed content [image message]: current main formats that placeholder into ThreadHistoryBody. A root/starter image remains a second reproducible gap because root messages still come from getMessageFeishu without rawContent and are excluded from listFeishuThreadMessages.

Is this the best way to solve the issue?

Partly. Reusing resolveFeishuMediaList and the existing [media attached: ...] hint format is the right direction, but the PR needs starter-message coverage and bounded history downloads before it is the narrow maintainable fix.

Full review comments:

  • [P2] Resolve media for the thread starter — extensions/feishu/src/bot.ts:956-958
    The new resolver only runs over historyMessages, but the code still passes rootMessageId to listFeishuThreadMessages and sets ThreadStarterBody from rootMsg?.content. Since getMessageFeishu still exposes only parsed content, an image/post/file used as the thread root remains [image message] with no attachment hint. Preserve raw content for fetched root messages too and run the same bounded resolver before assigning the starter body.
    Confidence: 0.86
  • [P2] Bound history media resolution — extensions/feishu/src/bot.ts:965
    This loop resolves media for every media-typed item in the fetched history. Thread bootstrap fetches up to 20 messages, and the Feishu downloader reads each resource into memory before saveMediaBuffer enforces mediaMaxMb, so one media-heavy thread can stall the first reply and consume avoidable bandwidth/memory. Add a small aggregate cap, such as recent media messages/items, before downloading history media.
    Confidence: 0.82
  • [P3] Add the required changelog entry — extensions/feishu/src/bot.ts:956
    This is a user-facing Feishu fix, but the PR changes only Feishu source/tests. Repo policy requires a CHANGELOG.md entry under Unreleased / Fixes; add one with appropriate contributor credit before merge.
    Confidence: 0.9

Overall correctness: patch is incorrect
Overall confidence: 0.84

Security concerns:

  • [medium] Unbounded history media downloads — extensions/feishu/src/bot.ts:965
    The new history loop may download media for every media-typed item in up to 20 fetched thread messages, while Feishu resource reads are buffered before the per-item mediaMaxMb check. A media-heavy allowed thread can therefore create avoidable network, memory, storage, and latency pressure during first bootstrap.
    Confidence: 0.82

Acceptance criteria:

  • pnpm test:extension feishu
  • pnpm exec oxfmt --check --threads=1 extensions/feishu/src/bot.ts extensions/feishu/src/send.ts extensions/feishu/src/types.ts extensions/feishu/src/bot.test.ts CHANGELOG.md
  • pnpm check:changed in Testbox before merge

What I checked:

  • Current inbound media path: Current main calls resolveFeishuMediaList only for the current inbound Feishu event before building MediaPath/MediaPaths for the agent payload. (extensions/feishu/src/bot.ts:853, 4babd925c40d)
  • Current thread history uses parsed content only: Thread bootstrap maps history messages directly to msg.content when constructing ThreadHistoryBody, with no media resolution step on main. (extensions/feishu/src/bot.ts:1152, 4babd925c40d)
  • Thread list drops raw media content on main: FeishuThreadMessageInfo exposes content/contentType but no rawContent, and listFeishuThreadMessages pushes only parsed content, so image_key/file_key cannot be recovered later. (extensions/feishu/src/send.ts:416, 4babd925c40d)
  • Parser produces placeholder symptom: parseFeishuMessageContent falls back to strings like [image message] when media content has no generic text/title field. (extensions/feishu/src/send.ts:343, 4babd925c40d)
  • Existing helper supports reuse but catches download failures internally: resolveFeishuMediaList already recognizes image/file/audio/video/media/sticker/post messages and logs failed downloads instead of throwing for ordinary media failures. (extensions/feishu/src/bot-content.ts:350, 4babd925c40d)
  • Attachment hint format is recognized: The embedded runner parses [media attached: path (mime) | url] style hints as image references, matching the PR's intended contract. (src/agents/pi-embedded-runner/run/images.ts:254, 4babd925c40d)

Likely related people:

  • Brian Qu: Feature-history search found commit 8a607d7 for fetching Feishu topic-thread context so the AI can see bot replies; this is the central path the PR modifies. (role: introduced related thread-bootstrap behavior; confidence: medium; commits: 8a607d755333; files: extensions/feishu/src/bot.ts, extensions/feishu/src/send.ts, extensions/feishu/src/bot.test.ts)
  • Peter Steinberger: Current checkout blame for the involved lines points at a recent baseline commit by Peter, and shortlog over the Feishu bot/send/media/test files shows the largest local contribution count. (role: recent maintainer and heavy Feishu-path contributor; confidence: medium; commits: 0640db72b059, 1e11b36d80, 8be3a4466c; files: extensions/feishu/src/bot.ts, extensions/feishu/src/send.ts, extensions/feishu/src/bot-content.ts)
  • Vincent Koc: Recent Feishu history includes multiple runtime seam, dedupe, lifecycle, and bot-test refactors by Vincent near the code paths this PR changes. (role: recent adjacent Feishu runtime/test maintainer; confidence: medium; commits: a3b9ae8fab, feca4aa49e, c96871db30; files: extensions/feishu/src/bot.ts, extensions/feishu/src/bot.test.ts, extensions/feishu/src/bot-content.ts)
  • wonglingyit-bjak: Their PR comment reviewed this media approach, confirmed the existing absolute-path hint contract, and identified remaining merge polish around loop cleanup and failed-download coverage. (role: reviewer with Feishu media context; confidence: medium; files: extensions/feishu/src/bot.ts, extensions/feishu/src/bot.test.ts)

Remaining risk / open question:

  • The PR overlaps the adjacent open Feishu pending-history media PR Feishu: include pending history media (images) in inbound context #54967, which appears to use explicit message/media bounds; maintainers should avoid landing duplicate rawContent/media-resolution paths that diverge.
  • The proposed history loop can trigger many Feishu resource downloads during first thread bootstrap unless it adds a small aggregate cap and tests that cap.
  • No live Feishu account validation is shown; the available proof is source inspection plus the PR's mocked Feishu extension test claim.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 4babd925c40d.

@gee-bjak gee-bjak marked this pull request as ready for review April 27, 2026 05:38
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 27, 2026

Greptile Summary

This PR resolves a gap where media messages in Feishu thread history were only ever surfaced as placeholder text ([image message]). It adds a rawContent field to FeishuThreadMessageInfo, converts the synchronous historyParts.map() to an async for...of loop, and calls the existing resolveFeishuMediaList helper for each media-typed history message before assembling the envelope. The logic is correct and consistent with how the current-message path handles attachments; all P2 findings are minor (constant hoisting, sequential vs. parallel downloads, missing error-path test).

Confidence Score: 4/5

Safe to merge; no correctness or security issues found, only minor style and test-coverage observations.

All findings are P2: a redundant array allocation inside a loop, sequential rather than parallel async downloads, and a missing error-path test case. No logic bugs, data-loss risks, or security concerns were identified.

No files require special attention beyond the P2 notes above.

Prompt To Fix All With AI
This is a comment left during a code review.
Path: extensions/feishu/src/bot.ts
Line: 965

Comment:
**`mediaTypes` array recreated on every loop iteration**

`mediaTypes` is declared with `const` inside the `for...of` loop body, so a new array is allocated for every history message. Since this list is a fixed constant, it should be hoisted above the loop (or de-duplicated against the identical array already defined inside `resolveFeishuMediaList`).

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/feishu/src/bot.ts
Line: 960-994

Comment:
**Sequential media downloads across history messages**

Each `resolveFeishuMediaList` call is awaited one-at-a-time inside the loop. With up to 20 thread history messages (the `limit` used in `listFeishuThreadMessages`) that each contain media, this processes downloads serially. Collecting all the media-resolution promises and awaiting them with `Promise.all` (then reassembling the results in order) would significantly reduce latency in the common case.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/feishu/src/bot.test.ts
Line: 2672-2751

Comment:
**Error-path not exercised by the new test**

The test covers the happy path (download succeeds, hint appears in `ThreadHistoryBody`), but the per-message error path is untested: when `downloadMessageResourceFeishu` rejects, the expectation is that the message still appears in history (without the attachment hint) and a log entry is emitted. Worth adding a second case that mocks a rejection and asserts the envelope is still pushed with the original `[image message]` body.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(feishu): download images in Feishu t..." | Re-trigger Greptile

let msgBody = msg.content;

// Download and inline media for image/file/audio/video/sticker/post messages.
const mediaTypes = ["image", "file", "audio", "video", "media", "sticker", "post"];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 mediaTypes array recreated on every loop iteration

mediaTypes is declared with const inside the for...of loop body, so a new array is allocated for every history message. Since this list is a fixed constant, it should be hoisted above the loop (or de-duplicated against the identical array already defined inside resolveFeishuMediaList).

Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/feishu/src/bot.ts
Line: 965

Comment:
**`mediaTypes` array recreated on every loop iteration**

`mediaTypes` is declared with `const` inside the `for...of` loop body, so a new array is allocated for every history message. Since this list is a fixed constant, it should be hoisted above the loop (or de-duplicated against the identical array already defined inside `resolveFeishuMediaList`).

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +960 to +994
for (const msg of historyMessages) {
const role = msg.senderType === "app" ? "assistant" : "user";
return core.channel.reply.formatAgentEnvelope({
channel: "Feishu",
from: `${msg.senderId ?? "Unknown"} (${role})`,
timestamp: msg.createTime,
body: msg.content,
envelope: envelopeOptions,
});
});
let msgBody = msg.content;

// Download and inline media for image/file/audio/video/sticker/post messages.
const mediaTypes = ["image", "file", "audio", "video", "media", "sticker", "post"];
if (mediaTypes.includes(msg.contentType)) {
try {
const mediaItems = await resolveFeishuMediaList({
cfg,
messageId: msg.messageId,
messageType: msg.contentType,
content: msg.rawContent ?? "",
maxBytes: mediaMaxBytes,
log,
accountId: account.accountId,
});
if (mediaItems.length > 0) {
const attachmentHints = mediaItems
.map(
(m) =>
`[media attached: ${m.path} (${m.contentType ?? "application/octet-stream"}) | ${m.path}]`,
)
.join("\n");
msgBody = [msgBody, attachmentHints].filter(Boolean).join("\n");
log(
`feishu[${account.accountId}]: downloaded ${mediaItems.length} media item(s) from thread history message ${msg.messageId}`,
);
}
} catch (err) {
log(
`feishu[${account.accountId}]: failed to download media for thread history message ${msg.messageId}: ${String(err)}`,
);
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Sequential media downloads across history messages

Each resolveFeishuMediaList call is awaited one-at-a-time inside the loop. With up to 20 thread history messages (the limit used in listFeishuThreadMessages) that each contain media, this processes downloads serially. Collecting all the media-resolution promises and awaiting them with Promise.all (then reassembling the results in order) would significantly reduce latency in the common case.

Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/feishu/src/bot.ts
Line: 960-994

Comment:
**Sequential media downloads across history messages**

Each `resolveFeishuMediaList` call is awaited one-at-a-time inside the loop. With up to 20 thread history messages (the `limit` used in `listFeishuThreadMessages`) that each contain media, this processes downloads serially. Collecting all the media-resolution promises and awaiting them with `Promise.all` (then reassembling the results in order) would significantly reduce latency in the common case.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines 2672 to 2751
);
});

it("downloads images from thread history messages and inlines attachment hints", async () => {
mockShouldComputeCommandAuthorized.mockReturnValue(false);
mockGetMessageFeishu.mockResolvedValue({
messageId: "om_topic_root",
chatId: "oc-group",
content: "thread root",
contentType: "text",
threadId: "omt_topic_img",
});
const imgRawContent = JSON.stringify({ image_key: "img_history_key_1" });
mockListFeishuThreadMessages.mockResolvedValue([
{
messageId: "om_img_msg",
senderId: "ou-sender",
senderType: "user",
content: "[image message]",
contentType: "image",
rawContent: imgRawContent,
createTime: 1710000000000,
},
]);
mockDownloadMessageResourceFeishu.mockResolvedValueOnce({
buffer: Buffer.from("fake-image-data"),
contentType: "image/png",
});
mockSaveMediaBuffer.mockResolvedValueOnce({
path: "/fake/.openclaw/media/inbound/thread-img.png",
contentType: "image/png",
});

const cfg: ClawdbotConfig = {
channels: {
feishu: {
groups: {
"oc-group": {
requireMention: false,
groupSessionScope: "group_topic",
},
},
},
},
} as ClawdbotConfig;

const event: FeishuMessageEvent = {
sender: { sender_id: { open_id: "ou-sender" } },
message: {
message_id: "om_text_followup",
root_id: "om_topic_root",
thread_id: "omt_topic_img",
chat_id: "oc-group",
chat_type: "group",
message_type: "text",
content: JSON.stringify({ text: "what was in that image?" }),
},
};

await dispatchMessage({ cfg, event });

// The image should have been downloaded via the resource API.
expect(mockDownloadMessageResourceFeishu).toHaveBeenCalledWith(
expect.objectContaining({
messageId: "om_img_msg",
fileKey: "img_history_key_1",
type: "image",
}),
);
// ThreadHistoryBody should include the attachment hint with the saved path.
expect(mockFinalizeInboundContext).toHaveBeenCalledWith(
expect.objectContaining({
ThreadHistoryBody: expect.stringContaining("/fake/.openclaw/media/inbound/thread-img.png"),
}),
);
});

it("does not dispatch twice for the same image message_id (concurrent dedupe)", async () => {
mockShouldComputeCommandAuthorized.mockReturnValue(false);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Error-path not exercised by the new test

The test covers the happy path (download succeeds, hint appears in ThreadHistoryBody), but the per-message error path is untested: when downloadMessageResourceFeishu rejects, the expectation is that the message still appears in history (without the attachment hint) and a log entry is emitted. Worth adding a second case that mocks a rejection and asserts the envelope is still pushed with the original [image message] body.

Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/feishu/src/bot.test.ts
Line: 2672-2751

Comment:
**Error-path not exercised by the new test**

The test covers the happy path (download succeeds, hint appears in `ThreadHistoryBody`), but the per-message error path is untested: when `downloadMessageResourceFeishu` rejects, the expectation is that the message still appears in history (without the attachment hint) and a log entry is emitted. Worth adding a second case that mocks a rejection and asserts the envelope is still pushed with the original `[image message]` body.

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

channel: feishu Channel integration: feishu size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants