Commit f0b327c

Bartok9 and steipete authored
fix(media): gate markdown image extraction by channel (#72718)
Closes #72642

Co-authored-by: Peter Steinberger <steipete@gmail.com>
1 parent 775ed36 commit f0b327c

15 files changed

Lines changed: 251 additions & 30 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -97,6 +97,7 @@ Docs: https://docs.openclaw.ai
 - Google Meet: route stateful `google_meet` tool actions through the gateway-owned runtime so created or joined realtime sessions remain visible to status, speak, and leave after the agent turn ends. Fixes #72440. (#72441) Thanks @BsnizND.
 - Google Meet/Voice Call: send Gemini Live a non-blocking consult continuation before long OpenClaw agent consults finish, then deliver the final result when idle so calls and meetings do not sit silent during tool-backed answers. (#72189) Thanks @VACInc.
 - Google Meet: preserve Gemini Live function names when replying to realtime tool calls so Google SDK validation accepts the `FunctionResponse` payload. Fixes #72425. (#72426) Thanks @BsnizND.
+- Discord/media: keep incidental Markdown image badges in final replies as text unless a channel opts into Markdown-image media extraction, while preserving Telegram Markdown-image media replies and explicit `MEDIA:` attachments. Fixes #72642. Thanks @solavrc and @Bartok9.
 - Matrix/E2EE: stabilize recovery and broken-device QA flows while avoiding Matrix device-cleanup sync races that could leave shutdown-time crypto work running. Thanks @gumadeiras.
 - Cron: apply `cron.maxConcurrentRuns` to a dedicated `cron-nested` isolated agent-turn lane as well as cron dispatch, so parallel cron jobs no longer serialize on inner LLM execution while non-cron nested flows keep their existing lane behavior. Fixes #72707. Thanks @kagura-agent.
 - Cron: report isolated runs as successful when verified cron delivery already delivered the reply, while keeping unresolved Message/Canvas tool failures fatal. Fixes #72732 and #50170; follow-up to #54188. Thanks @zNatix, @pixeldyn, and @ChickenEggRoll.

docs/reference/rich-output-protocol.md

Lines changed: 4 additions & 0 deletions
@@ -17,6 +17,10 @@ Remote `MEDIA:` attachments must be public `https:` URLs. Plain `http:`,
 loopback, link-local, private, and internal hostnames are ignored as attachment
 directives; server-side media fetchers still enforce their own network guards.

+Plain Markdown image syntax stays text by default. Channels that intentionally
+map Markdown image replies to media attachments opt in at their outbound
+adapter; Telegram does this so `![alt](url)` can still become a media reply.
+
 These directives are separate. `MEDIA:` and reply/voice tags remain delivery metadata; `[embed ...]` is the web-only rich render path.
 Trusted tool-result media uses the same `MEDIA:` / `[[audio_as_voice]]` parser before delivery, so text tool outputs can still mark an audio attachment as a voice note.
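
As a rough illustration of the `https:`-only rule quoted in the context lines of this hunk, a URL check could look like the sketch below. `looksLikePublicHttpsUrl` is a hypothetical helper, not the project's implementation, and its host rules are deliberately incomplete assumptions (a few name suffixes, IPv4 literals, and a blanket refusal of IPv6 literals).

```typescript
// Hypothetical helper: a rough, incomplete approximation of the documented rule.
function looksLikePublicHttpsUrl(candidate: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(candidate);
  } catch {
    return false; // not an absolute URL at all
  }
  if (parsed.protocol !== "https:") return false; // rejects http:, file:, data:, ...

  const host = parsed.hostname.toLowerCase();
  if (host.startsWith("[")) return false; // assumption: refuse every IPv6 literal
  if (host === "localhost" || host.endsWith(".local") || host.endsWith(".internal")) {
    return false;
  }

  // IPv4 literals: refuse loopback, link-local, and RFC 1918 private ranges.
  const octets = host.split(".").map(Number);
  const isIpv4 =
    octets.length === 4 && octets.every((n) => Number.isInteger(n) && n >= 0 && n <= 255);
  if (isIpv4) {
    const a = octets[0] ?? -1;
    const b = octets[1] ?? -1;
    if (a === 127 || a === 10) return false;
    if (a === 169 && b === 254) return false;
    if (a === 192 && b === 168) return false;
    if (a === 172 && b >= 16 && b <= 31) return false;
  }
  return true;
}
```

The docs note that server-side media fetchers enforce their own network guards, so a check like this would only decide whether a string is treated as an attachment directive; it is not a substitute for those guards.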

extensions/telegram/src/outbound-adapter.ts

Lines changed: 1 addition & 0 deletions
@@ -121,6 +121,7 @@ export const telegramOutbound: ChannelOutboundAdapter = {
   deliveryMode: "direct",
   chunker: markdownToTelegramHtmlChunks,
   chunkerMode: "markdown",
+  extractMarkdownImages: true,
   textChunkLimit: TELEGRAM_TEXT_CHUNK_LIMIT,
   sanitizeText: ({ text }) => sanitizeForPlainText(text),
   shouldSkipPlainTextSanitization: ({ payload }) => Boolean(payload.channelData),

extensions/telegram/src/outbound-base.ts

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@ export const telegramOutboundBaseAdapter = {
   deliveryMode: "direct" as const,
   chunker: chunkMarkdownText,
   chunkerMode: "markdown" as const,
+  extractMarkdownImages: true,
   textChunkLimit: 4000,
   pollMaxOptions: 10,
 };

src/auto-reply/reply/agent-runner-payloads.test.ts

Lines changed: 3 additions & 0 deletions
@@ -350,6 +350,7 @@ describe("buildReplyPayloads media filter integration", () => {
   it("extracts markdown image replies into final payload media urls", async () => {
     const { replyPayloads } = await buildReplyPayloads({
       ...baseParams,
+      extractMarkdownImages: true,
       payloads: [{ text: "Here you go\n\n![chart](https://example.com/chart.png)" }],
     });

@@ -364,6 +365,7 @@ describe("buildReplyPayloads media filter integration", () => {
   it("preserves inline caption text when lifting markdown image replies into media", async () => {
     const { replyPayloads } = await buildReplyPayloads({
       ...baseParams,
+      extractMarkdownImages: true,
       payloads: [{ text: 'Look ![chart](https://example.com/chart.png "Quarterly chart") now' }],
     });

@@ -379,6 +381,7 @@ describe("buildReplyPayloads media filter integration", () => {
     const text = "Look ![chart](file:///etc/passwd) now";
     const { replyPayloads } = await buildReplyPayloads({
       ...baseParams,
+      extractMarkdownImages: true,
       payloads: [{ text }],
     });

src/auto-reply/reply/agent-runner-payloads.ts

Lines changed: 2 additions & 0 deletions
@@ -107,6 +107,7 @@ export async function buildReplyPayloads(params: {
   originatingChannel?: OriginatingChannelType;
   originatingTo?: string;
   accountId?: string;
+  extractMarkdownImages?: boolean;
   normalizeMediaPaths?: (payload: ReplyPayload) => Promise<ReplyPayload>;
 }): Promise<{ replyPayloads: ReplyPayload[]; didLogHeartbeatStrip: boolean }> {
   let didLogHeartbeatStrip = params.didLogHeartbeatStrip;
@@ -148,6 +149,7 @@ export async function buildReplyPayloads(params: {
       currentMessageId: params.currentMessageId,
       silentToken: SILENT_REPLY_TOKEN,
       parseMode: "always",
+      extractMarkdownImages: params.extractMarkdownImages,
     });
     const mediaNormalizedPayload = await normalizeReplyPayloadMedia({
       payload: parsed.payload,

src/auto-reply/reply/reply-delivery.ts

Lines changed: 3 additions & 0 deletions
@@ -17,6 +17,7 @@ export function normalizeReplyPayloadDirectives(params: {
   silentToken?: string;
   trimLeadingWhitespace?: boolean;
   parseMode?: ReplyDirectiveParseMode;
+  extractMarkdownImages?: boolean;
 }): { payload: ReplyPayload; isSilent: boolean } {
   const parseMode = params.parseMode ?? "always";
   const silentToken = params.silentToken ?? SILENT_REPLY_TOKEN;
@@ -27,12 +28,14 @@ export function normalizeReplyPayloadDirectives(params: {
     (parseMode === "auto" &&
       (sourceText.includes("[[") ||
         /media:/i.test(sourceText) ||
+        (params.extractMarkdownImages === true && /!\[[^\]]*]\(/.test(sourceText)) ||
         sourceText.includes(silentToken)));

   const parsed = shouldParse
     ? parseReplyDirectives(sourceText, {
         currentMessageId: params.currentMessageId,
         silentToken,
+        extractMarkdownImages: params.extractMarkdownImages,
       })
     : undefined;
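
The `"auto"` branch in this hunk is a cheap pre-check: it only asks whether the text might contain a directive before the full parser runs. A minimal standalone sketch of that gate follows. `shouldParseInAutoMode` and `SKETCH_SILENT_TOKEN` are hypothetical names introduced for illustration, and the token value is an assumption; the four conditions and the regular expression are taken from the diff.

```typescript
// Hypothetical stand-in for the project's SILENT_REPLY_TOKEN (real value not shown in the diff).
const SKETCH_SILENT_TOKEN = "<<sketch-silent>>";

// Same pattern as the diff: "![", alt text without "]", then "](".
const SKETCH_MARKDOWN_IMAGE_OPENER = /!\[[^\]]*]\(/;

// Returns true when "auto" mode should hand the text to the directive parser.
function shouldParseInAutoMode(sourceText: string, extractMarkdownImages?: boolean): boolean {
  return (
    sourceText.includes("[[") ||
    /media:/i.test(sourceText) ||
    // Markdown images only trigger parsing when the channel has opted in.
    (extractMarkdownImages === true && SKETCH_MARKDOWN_IMAGE_OPENER.test(sourceText)) ||
    sourceText.includes(SKETCH_SILENT_TOKEN)
  );
}
```

With this shape, a badge such as `![Node.js](https://img.shields.io/badge/Node.js-339933)` does not trigger parsing on its own unless the flag is explicitly `true`, which matches the opt-in behavior the commit describes.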

src/auto-reply/reply/reply-directives.ts

Lines changed: 10 additions & 2 deletions
@@ -13,11 +13,19 @@ export type ReplyDirectiveParseResult = {
   isSilent: boolean;
 };

+export type ReplyDirectiveParseOptions = {
+  currentMessageId?: string;
+  silentToken?: string;
+  extractMarkdownImages?: boolean;
+};
+
 export function parseReplyDirectives(
   raw: string,
-  options: { currentMessageId?: string; silentToken?: string } = {},
+  options: ReplyDirectiveParseOptions = {},
 ): ReplyDirectiveParseResult {
-  const split = splitMediaFromOutput(raw);
+  const split = splitMediaFromOutput(raw, {
+    extractMarkdownImages: options.extractMarkdownImages,
+  });
   let text = split.text ?? "";

   const replyParsed = parseInlineDirectives(text, {
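
The body of `splitMediaFromOutput` is not part of this diff, so the following is only a sketch of what an option-gated extraction step could look like, not the project's implementation. `sketchSplitMarkdownImages`, `SketchSplitResult`, and the regular expression are hypothetical, and the sketch assumes that only `https://` URLs are lifted and that URLs contain no whitespace or closing parentheses.

```typescript
type SketchSplitResult = { text: string; mediaUrls: string[] };

// Matches ![alt](url) or ![alt](url "title"); the URL is capture group 1.
const SKETCH_MARKDOWN_IMAGE = /!\[[^\]]*]\(\s*(\S+?)(?:\s+"[^"]*")?\s*\)/g;

function sketchSplitMarkdownImages(
  raw: string,
  options: { extractMarkdownImages?: boolean } = {},
): SketchSplitResult {
  if (options.extractMarkdownImages !== true) {
    return { text: raw, mediaUrls: [] }; // default: leave the Markdown untouched
  }
  const mediaUrls: string[] = [];
  const stripped = raw.replace(SKETCH_MARKDOWN_IMAGE, (match: string, url: string) => {
    if (!url.startsWith("https://")) return match; // e.g. file:// stays as text
    mediaUrls.push(url);
    return "";
  });
  if (mediaUrls.length === 0) return { text: raw, mediaUrls };
  // Collapse the gap left behind by each removed image.
  const text = stripped.replace(/[ \t]{2,}/g, " ").replace(/\n{3,}/g, "\n\n").trim();
  return { text, mediaUrls };
}
```

Under these assumptions, `Chart ![chart](https://example.com/chart.png) now` with the option enabled yields the text `Chart now` plus one media URL, while the same input without the option comes back unchanged.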

src/channels/plugins/outbound.types.ts

Lines changed: 2 additions & 0 deletions
@@ -76,6 +76,8 @@ export type ChannelOutboundAdapter = {
   deliveryMode: "direct" | "gateway" | "hybrid";
   chunker?: ((text: string, limit: number, ctx?: ChannelOutboundChunkContext) => string[]) | null;
   chunkerMode?: "text" | "markdown";
+  /** Lift remote Markdown image syntax in text into outbound media attachments. */
+  extractMarkdownImages?: boolean;
   textChunkLimit?: number;
   sanitizeText?: (params: { text: string; payload: ReplyPayload }) => string;
   pollMaxOptions?: number;
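
Because the new field is optional, every existing adapter keeps the text-only behavior without changes. A minimal sketch of how a caller might read the flag is shown below. `SketchOutboundAdapter`, the two adapter constants, and `wantsMarkdownImageExtraction` are hypothetical names for illustration; only the `extractMarkdownImages` and `deliveryMode` fields mirror the type in this hunk, and the non-Telegram adapter's values are assumptions.

```typescript
// Hypothetical, trimmed-down adapter shape mirroring two fields from the hunk above.
type SketchOutboundAdapter = {
  deliveryMode: "direct" | "gateway" | "hybrid";
  extractMarkdownImages?: boolean;
};

const sketchOptedInAdapter: SketchOutboundAdapter = {
  deliveryMode: "direct",
  extractMarkdownImages: true, // opts in, as the Telegram adapters do in this commit
};

const sketchDefaultAdapter: SketchOutboundAdapter = {
  deliveryMode: "direct", // assumption: value chosen only for illustration
  // extractMarkdownImages omitted: Markdown images stay text
};

// Anything other than an explicit `true` means "keep Markdown images as text".
function wantsMarkdownImageExtraction(adapter?: SketchOutboundAdapter): boolean {
  return adapter?.extractMarkdownImages === true;
}
```

Comparing against `true` rather than relying on truthiness keeps a missing adapter or an omitted field on the safe, text-only default.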

src/infra/outbound/deliver.test.ts

Lines changed: 48 additions & 0 deletions
@@ -1235,6 +1235,54 @@ describe("deliverOutboundPayloads", () => {
     );
   });

+  it("keeps markdown images as text for channels that do not opt in", async () => {
+    const sendMatrix = vi.fn().mockResolvedValue({ messageId: "m-text", roomId: "!room" });
+
+    await deliverOutboundPayloads({
+      cfg: matrixChunkConfig,
+      channel: "matrix",
+      to: "!room:example",
+      payloads: [{ text: "Tech: ![Node.js](https://img.shields.io/badge/Node.js-339933)" }],
+      deps: { matrix: sendMatrix },
+    });
+
+    expect(sendMatrix).toHaveBeenCalledWith(
+      "!room:example",
+      "Tech: ![Node.js](https://img.shields.io/badge/Node.js-339933)",
+      expect.not.objectContaining({ mediaUrl: expect.any(String) }),
+    );
+  });
+
+  it("extracts markdown images for channels that opt in", async () => {
+    const sendMatrix = vi.fn().mockResolvedValue({ messageId: "m-media", roomId: "!room" });
+    setActivePluginRegistry(
+      createTestRegistry([
+        {
+          pluginId: "matrix",
+          source: "test",
+          plugin: createOutboundTestPlugin({
+            id: "matrix",
+            outbound: { ...matrixOutboundForTest, extractMarkdownImages: true },
+          }),
+        },
+      ]),
+    );
+
+    await deliverOutboundPayloads({
+      cfg: matrixChunkConfig,
+      channel: "matrix",
+      to: "!room:example",
+      payloads: [{ text: "Chart ![chart](https://example.com/chart.png) now" }],
+      deps: { matrix: sendMatrix },
+    });
+
+    expect(sendMatrix).toHaveBeenCalledWith(
+      "!room:example",
+      "Chart now",
+      expect.objectContaining({ mediaUrl: "https://example.com/chart.png" }),
+    );
+  });
+
   it("normalizes payloads and drops empty entries", () => {
     const normalized = normalizeOutboundPayloads([
       { text: "hi" },
