fix(reply): parse markdown image replies as media by vincentkoc · Pull Request #66471 · openclaw/openclaw

vincentkoc · 2026-04-14T09:50:40Z

Summary

parse markdown image syntax in outbound reply text the same way we already parse MEDIA: tokens
keep surrounding caption text while lifting ![...](...) targets into mediaUrl / mediaUrls
add regression coverage for the shared parser and the final reply-payload builder path used before outbound delivery

Root cause

OpenClaw's final reply normalization path only extracted outbound media from explicit MEDIA: tokens. Replies that used markdown image syntax like ![chart](https://example.com/chart.png) stayed text-only, so channels expecting a normalized media payload fell back to sending the raw URL as plain text instead of a media message.

That showed up most obviously in Telegram group chats because those turns lean on the normalized final reply path more heavily once tool-summary text is suppressed.

Design

extend src/media/parse.ts to recognize markdown image targets and lift them into ordered media segments
reuse the existing final reply-payload build path in src/auto-reply/reply/agent-runner-payloads.ts so no Telegram-specific special case is needed
leave invalid markdown image targets as plain text

Validation

pnpm test:serial src/media/parse.test.ts src/auto-reply/reply/agent-runner-payloads.test.ts
pnpm build

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2dfcd77e71

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

greptile-apps · 2026-04-14T09:55:20Z

Greptile Summary

This PR extends src/media/parse.ts to recognize markdown image syntax (![alt](url)) and lift matched URLs into the mediaUrls/mediaUrl fields, mirroring the existing MEDIA: token path. The fix targets the final reply normalization path (buildReplyPayloads with parseMode: "always"), which is the path Telegram group chats lean on most.

One gap: collectMarkdownImageSegments only adds cleanedLine (text after the last image) to keptLines. When an image appears inline with leading text on the same line — e.g. "Caption ![img](url)" — the preceding caption is added to lineSegments but not to keptLines, so it is silently dropped from the returned text field. The new tests cover the common LLM pattern (images on their own lines) and pass, but the inline-caption case is untested and would lose caption text.

Confidence Score: 4/5

Safe to merge for the documented Telegram group chat case; the inline-caption text loss is a pre-existing gap in the new code worth addressing before wider rollout.
All new tests pass for the common LLM output pattern. One P2 issue: caption text on the same line as an inline image is silently dropped from the text field (only lands in segments). LLMs rarely produce inline images, so practical impact is low, but it is incorrect behavior in the changed code path.
src/media/parse.ts — specifically the keptLines handling inside the collectMarkdownImageSegments caller block (lines ~247-259).

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/media/parse.ts
Line: 133-180

Comment:
**Inline caption text silently dropped from `text`**

When a markdown image appears inline with leading caption text on the same line (e.g. `"See this: ![chart](url)"`), the text before the image is pushed to `lineSegments` but `pieces` is then cleared, so `cleanedLine` only captures text *after* the last image. The caller only adds `cleanedLine` to `keptLines`, so any preceding caption is lost from `cleanedText`/`text` — it exists only in `segments`, which the reply path (`parseReplyDirectives`) doesn't consume.

For `"Caption ![img](https://…)"` the result is `{ text: "", mediaUrls: ["https://…"] }` — the caption disappears silently. The new tests cover images on their own lines (the common LLM pattern) but leave the inline case untested.

A straightforward fix is to collect all non-empty text segments into `keptLines` as well:

```ts
for (const segment of markdownImageResult.lineSegments) {
  if (segment.type === "text") {
    keptLines.push(segment.text);   // keep inline caption in cleanedText
    pushTextSegment(segment.text);
    continue;
  }
  segments.push(segment);
}
```

and skip the separate `if (markdownImageResult.cleanedLine)` push (it's already handled above).

How can I resolve this? If you propose a fix, please make it concise.

_{Reviews (1): Last reviewed commit: "fix(reply): parse markdown image replies..." | Re-trigger Greptile}

EronFan · 2026-04-14T09:56:29Z

Good fix. Extending src/media/parse.ts to handle ![...](...) markdown image syntax is the right approach — consistent with existing MEDIA: token handling, no channel-specific special cases needed. Regression coverage added. ✅

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: acab679138

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 75a56f7915

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-14T10:16:10Z

+    segmentPieces.push(before);
+    visiblePieces.push(before);
+
+    const target = cleanCandidate(unwrapQuoted(match.destination) ?? match.destination);


Preserve trailing ')' in markdown image destinations

The markdown-image branch still normalizes parsed destinations with cleanCandidate(...), which strips trailing punctuation including ) and ]. For valid links that end in a balanced parenthesis (for example ![x](https://en.wikipedia.org/wiki/Test_(film))), the extracted mediaUrl is truncated to https://en.wikipedia.org/wiki/Test_(film, so downstream media delivery uses a broken URL even though parseMarkdownImageDestination already parsed the destination correctly.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-14T10:16:10Z

    if (!trimmedStart.startsWith("MEDIA:")) {
-      keptLines.push(line);
-      pushTextSegment(line);
+      const markdownImageResult = collectMarkdownImageSegments({ line, media });


Ignore markdown image syntax inside inline code spans

Non-MEDIA: lines are now always passed through markdown-image extraction, but this path only excludes fenced code blocks and does not respect inline backtick spans. As a result, literal examples like Use `![chart](https://example.com/chart.png)` syntax are converted into outbound media, which removes intended literal text and can send unintended images whenever replies discuss markdown syntax.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 86bdf95641

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-23T23:11:15Z

+    matches.length < MAX_MARKDOWN_IMAGE_MATCHES_PER_LINE &&
+    attempts < MAX_MARKDOWN_IMAGE_ATTEMPTS_PER_LINE
+  ) {
+    const index = line.indexOf("![", searchIndex);


Ignore escaped markdown image syntax

The scanner currently treats any ![ as an image opener without checking whether it was escaped, so inputs like \![chart](https://example.com/chart.png) are parsed as outbound media instead of literal text. In Markdown, a backslash-escaped image marker should render verbatim; with this logic we can unintentionally send media when a reply is trying to show syntax examples, and the visible text is also corrupted by leaving a stray backslash.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 69c654aa61

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-24T01:36:15Z

    return { text: "" };
  }
  const mayContainMediaToken = /media:/i.test(trimmedRaw);
+  const mayContainMarkdownImage = /!\[[^\]]*]\(/.test(trimmedRaw);


Broaden markdown-image precheck for nested alt brackets

The fast-path guard mayContainMarkdownImage only matches !\[[^\]]*]\(, so valid Markdown like ![a [b]](https://example.com/x.png) is treated as having no image and exits early with plain text. findMarkdownImageMatches can parse nested [] correctly, but this guard prevents it from running whenever there is no MEDIA: token or audio tag on the message. That leaves standards-compliant image replies unlifted to mediaUrl/mediaUrls in the common markdown-only case.

Useful? React with 👍 / 👎.

aisle-research-bot · 2026-04-24T01:38:31Z

🔒 Aisle Security Analysis

We found 2 potential security issue(s) in this PR:

#	Severity	Title
1	🟠 High	Potential SSRF via Markdown image URL extraction in splitMediaFromOutput
2	🟡 Medium	Algorithmic complexity DoS in markdown image parsing (splitMediaFromOutput)

1. 🟠 Potential SSRF via Markdown image URL extraction in splitMediaFromOutput

Property	Value
Severity	High
CWE	CWE-918
Location	`src/media/parse.ts:170-172`

Description

splitMediaFromOutput now extracts media URLs not only from explicit MEDIA: directives, but also from Markdown image syntax (![alt](destination)). The extracted destination is treated as media if it matches ^https?:// and passes isValidMedia(), which unconditionally returns true for any http(s) URL.

This introduces a new path for untrusted text to cause downstream components to treat arbitrary attacker-controlled URLs as media:

Input: untrusted agent/tool output text containing ![x](http://...) / ![x](https://...)
Normalization/cleaning: unwrapQuoted / cleanCandidate / normalizeMediaSource
Approval: isRemoteMarkdownImageMedia() which only checks scheme + isValidMedia()
Output: URL is appended to mediaUrls and emitted as a {type: "media"} segment

If downstream logic fetches, proxies, or otherwise dereferences mediaUrls (common for rendering, previewing, uploading, or safety scanning), an attacker can trigger SSRF to internal services (e.g., http://127.0.0.1, http://169.254.169.254, RFC1918 ranges) simply by embedding a Markdown image.

Vulnerable code:

const target = normalizeMediaSource(
  cleanCandidate(unwrapQuoted(match.destination) ?? match.destination),
);
if (isRemoteMarkdownImageMedia(target)) {
  params.media.push(target);
  lineSegments.push({ type: "media", url: target });
}

function isRemoteMarkdownImageMedia(candidate: string): boolean {
  return /^https?:\/\//i.test(candidate) && isValidMedia(candidate);
}

function isValidMedia(candidate: string) {
  ...
  if (/^https?:\/\//i.test(candidate)) {
    return true;
  }
  ...
}

Recommendation

Treat remote media URLs as untrusted and apply SSRF mitigations before accepting them as media.

Suggested hardening:

Parse and validate URLs using new URL() (reject invalid URLs, reject credentials).
Block private/loopback/link-local IP ranges and common internal hostnames.
Optionally enforce an allowlist of permitted domains for remote media.
Consider requiring an explicit MEDIA: directive for remote fetches, and treat Markdown images as display-only unless explicitly enabled.

Example:

import dns from "node:dns/promises";
import net from "node:net";

function isPublicIp(ip: string): boolean {
  // block loopback, link-local, RFC1918, etc. (implement fully)
  return !(
    ip.startsWith("127.") ||
    ip.startsWith("10.") ||
    ip.startsWith("192.168.") ||
    ip.startsWith("169.254.")
  );
}

async function isSafeRemoteMedia(urlStr: string): Promise<boolean> {
  let url: URL;
  try { url = new URL(urlStr); } catch { return false; }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  if (url.username || url.password) return false;

  // Resolve and verify IPs (beware DNS rebinding; do checks at connect-time too)
  const addrs = await dns.lookup(url.hostname, { all: true });
  if (addrs.some(a => net.isIP(a.address) && !isPublicIp(a.address))) return false;

  return true;
}

Apply isSafeRemoteMedia() (or equivalent) inside isRemoteMarkdownImageMedia before pushing the URL to mediaUrls.

2. 🟡 Algorithmic complexity DoS in markdown image parsing (splitMediaFromOutput)

Property	Value
Severity	Medium
CWE	CWE-400
Location	`src/media/parse.ts:293-326`

Description

splitMediaFromOutput() now attempts to extract remote media URLs from Markdown image syntax for every non-MEDIA: line when the overall output matches /!\[[^\]]*]\(/.

The per-line parser can be forced into high CPU usage with adversarially crafted text:

findMarkdownImageMatches() loops up to MAX_MARKDOWN_IMAGE_ATTEMPTS_PER_LINE (80) and for each candidate calls findMatchingBracket(), which may scan to the end of the line when brackets are unbalanced or deeply nested.
On failures, searchIndex only advances by index + 2, allowing many overlapping ![ candidates and repeated near-full-line scans.
There is no cap on total number of lines or total message size processed by splitMediaFromOutput(). An attacker can supply many lines near MAX_MARKDOWN_IMAGE_LINE_LENGTH (20k) containing repeated ![ sequences that trigger the expensive path, producing large cumulative CPU time.

This is a classic resource-exhaustion vector (algorithmic complexity) and may be reachable if raw contains untrusted user/model output in request/response handling paths.

Vulnerable code (hot loop + full-line scans):

while (matches.length < MAX_MARKDOWN_IMAGE_MATCHES_PER_LINE && attempts < MAX_MARKDOWN_IMAGE_ATTEMPTS_PER_LINE) {
  const index = line.indexOf("![", searchIndex);
  attempts += 1;
  const altEnd = findMatchingBracket(line, index + 2, "[", "]");
  ...
  searchIndex = index + 2;
}

Recommendation

Mitigate worst-case CPU and memory usage by adding global and more linear-time limits:

Add a cap on total input size / number of lines processed by splitMediaFromOutput() (or short-circuit Markdown image extraction when the input is very large).
Avoid repeated rescans by making the matcher single-pass (advance searchIndex more aggressively on failure, e.g. to altEnd + 1 when altEnd exists, or to the next ![ after the closing ] when possible).
Consider lowering the per-line attempt limit or stopping after a maximum number of total markdown-image parse operations per message.

Example defensive cap:

const MAX_OUTPUT_CHARS = 200_000;
const MAX_LINES = 2_000;
const safeRaw = trimmedRaw.slice(0, MAX_OUTPUT_CHARS);
const lines = safeRaw.split("\n").slice(0, MAX_LINES);

Analyzed PR: #66471 at commit 2b7bf89

_{Last updated on: 2026-04-24T02:42:18Z}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8cbea62427

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-24T02:31:02Z

@@ -185,8 +445,23 @@ export function splitMediaFromOutput(raw: string): {

    const trimmedStart = line.trimStart();
    if (!trimmedStart.toUpperCase().startsWith("MEDIA:")) {
-      keptLines.push(line);
-      pushTextSegment(line);
+      const markdownImageResult = collectMarkdownImageSegments({ line, media });


Skip markdown image parsing inside indented code blocks

splitMediaFromOutput only suppresses extraction for fenced code spans, then unconditionally runs collectMarkdownImageSegments on every other non-MEDIA: line. That means valid Markdown indented code blocks (e.g. lines starting with four spaces containing ![...](...) as a literal example) are treated as outbound media and the literal text is removed, which can send unintended images when replies include markdown syntax examples.

Useful? React with 👍 / 👎.

vincentkoc marked this pull request as ready for review April 14, 2026 09:50

vincentkoc self-assigned this Apr 14, 2026

openclaw-barnacle Bot added size: S maintainer Maintainer-authored PR labels Apr 14, 2026

chatgpt-codex-connector Bot reviewed Apr 14, 2026

View reviewed changes

Comment thread src/media/parse.ts

Comment thread src/media/parse.ts Outdated

greptile-apps Bot reviewed Apr 14, 2026

View reviewed changes

Comment thread src/media/parse.ts

chatgpt-codex-connector Bot reviewed Apr 14, 2026

View reviewed changes

Comment thread src/media/parse.ts Outdated

openclaw-barnacle Bot added size: M and removed size: S labels Apr 14, 2026

chatgpt-codex-connector Bot reviewed Apr 14, 2026

View reviewed changes

vincentkoc force-pushed the kitsune/telegram-group-media branch from 75a56f7 to 86bdf95 Compare April 23, 2026 23:07

chatgpt-codex-connector Bot reviewed Apr 23, 2026

View reviewed changes

vincentkoc force-pushed the kitsune/telegram-group-media branch 2 times, most recently from ccc1413 to 69c654a Compare April 24, 2026 01:32

chatgpt-codex-connector Bot reviewed Apr 24, 2026

View reviewed changes

vincentkoc force-pushed the kitsune/telegram-group-media branch from 69c654a to 8cbea62 Compare April 24, 2026 02:26

chatgpt-codex-connector Bot reviewed Apr 24, 2026

View reviewed changes

vincentkoc force-pushed the kitsune/telegram-group-media branch from 8cbea62 to 9674af7 Compare April 24, 2026 02:32

vincentkoc added 3 commits April 23, 2026 19:33

fix(reply): parse markdown image replies as media

976528f

fix(reply): preserve inline markdown image captions

f21b276

fix(reply): harden markdown image parsing

2b7bf89

vincentkoc force-pushed the kitsune/telegram-group-media branch from 9674af7 to 2b7bf89 Compare April 24, 2026 02:34

vincentkoc merged commit 60d892d into main Apr 24, 2026
12 checks passed

vincentkoc deleted the kitsune/telegram-group-media branch April 24, 2026 02:34

This was referenced Apr 27, 2026

[Bug]: Discord final replies attach Markdown image badges from assistant text #72642

Closed

[codex] Keep markdown image badges in Discord replies #72649

Closed

steipete mentioned this pull request Apr 27, 2026

fix(media): gate markdown image extraction by channel #72718

Merged

Uh oh!

Conversation

vincentkoc commented Apr 14, 2026

Summary

Root cause

Design

Validation

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

greptile-apps Bot commented Apr 14, 2026

Greptile Summary

Confidence Score: 4/5

Uh oh!

Uh oh!

EronFan commented Apr 14, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

aisle-research-bot Bot commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔒 Aisle Security Analysis

Description

Recommendation

Description

Recommendation

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

aisle-research-bot Bot commented Apr 24, 2026 •

edited

Loading