fix(feishu): sanitize markdown in streaming cards to prevent truncation#54899
fix(feishu): sanitize markdown in streaming cards to prevent truncation#54899Vicky-v7 wants to merge 3 commits intoopenclaw:mainfrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8e503b9601
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| result = result.replace(/<(?!\/?(?:a|b|i|em|strong|br|p|div|span|img|at|code|pre)\b)[^>\n]{0,80}(?!>)$/gm, (m) => { | ||
| return m.replace(/</g, "\\<"); |
There was a problem hiding this comment.
Exclude fenced code blocks from angle-bracket escaping
The new sanitizer only masks inline code (`...`) before running the angle-bracket regex, so fenced blocks are still rewritten. In practice, streamed code like if (x < 10) is transformed to if (x \< 10), which corrupts common code output and adds backslashes users did not generate. Because sanitizeCardKitMarkdown now runs on every streaming update, this will affect normal code responses that contain < inside fenced markdown blocks.
Useful? React with 👍 / 👎.
| // Interactive cards in reply context cause older Feishu clients to show | ||
| // "请升级至最新版本客户端" instead of the actual content. | ||
| if (replyToMessageId) { | ||
| return sendMessageFeishu({ cfg, to, text: cardText, replyToMessageId, replyInThread, accountId }); |
There was a problem hiding this comment.
Preserve mention formatting in reply fallback path
In reply mode, this function now routes through sendMessageFeishu, but it still precomputes mentions using buildMentionedCardContent (<at id=...></at>) and drops the original mentions argument. The post-message path expects text-style mention markup (or mentions so it can build it), so replies with mentions can render raw tags or fail to notify users. This regression only appears when both replyToMessageId and mentions are provided.
Useful? React with 👍 / 👎.
Greptile SummaryThis PR adds a Confidence Score: 2/5Not safe to merge without fixes: two concrete bugs would corrupt code-block rendering and break mention display in the reply path. The PR has two independent P1 issues. The fenced-block escaping bug directly corrupts rendered output for any streamed code block containing comparison operators, which is a very common LLM response pattern. The mention format mismatch silently breaks @-mentions in the reply fallback path. Both are regressions introduced by this PR and need targeted fixes before merging. Both changed files need attention: streaming-card.ts (fenced block protection in step 3 of sanitizeCardKitMarkdown) and send.ts (mention formatting in sendMarkdownCardFeishu).
|
| Filename | Overview |
|---|---|
| extensions/feishu/src/streaming-card.ts | Adds sanitizeCardKitMarkdown to balance backticks and escape angle brackets before streaming updates. Steps 1 and 2 (fence and inline backtick balancing) look correct, but step 3 (angle-bracket escaping) does not protect fenced code block contents, causing < inside code blocks to be incorrectly escaped to <. |
| extensions/feishu/src/send.ts | sendMarkdownCardFeishu now falls back to post format when replyToMessageId is set. When mentions are present the pre-formatted card-syntax string is passed to sendMessageFeishu, which expects post-format mention syntax — mentions will not render correctly in this reply path. |
Prompt To Fix All With AI
This is a comment left during a code review.
Path: extensions/feishu/src/streaming-card.ts
Line: 70-77
Comment:
**Fenced code blocks unprotected during angle-bracket escaping**
Step 3 replaces inline code spans with `\0CODE...CODE\0` markers before applying the angle-bracket regex, but it never protects the contents of triple-backtick fenced blocks. Any `<` inside a fenced block (e.g. a Python comparison `if x < y:`) will be matched by the regex and escaped to `\<`. In standard markdown, backslash escaping is disabled inside code blocks, so the rendered output will show the literal backslash — `if x \< y:` — corrupting the code display.
Minimal reproducer for `sanitizeCardKitMarkdown`:
```
```python
if x < 10:
print(x)
```
```
→ the `<` on the comparison line becomes `\<`.
The same `split(/(^```[\s\S]*?^```)/m)` pattern used in step 2 could protect fenced segments in step 3 (only apply the angle-bracket replace to non-fenced parts), or the inline-code placeholder pass could be extended to also wrap fenced blocks.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: extensions/feishu/src/send.ts
Line: 700-702
Comment:
**Mention format mismatch in reply fallback path**
When `mentions` is non-empty **and** `replyToMessageId` is set, the code first formats mentions with `buildMentionedCardContent`, which produces card-format syntax (`<at id=ou_xxx></at>` — unquoted `id` attribute, no display text). It then passes that pre-formatted string to `sendMessageFeishu`, which wraps it in a post-format `md` element. Post `md` expects text-format mentions: `<at user_id="ou_xxx">name</at>`.
The two formats are incompatible (different attribute names, quoting, and display text), so mentions will silently fail to render in the reply fallback path.
One fix: pass the original `text` and the `mentions` array to `sendMessageFeishu` instead of the pre-formatted `cardText`:
```typescript
if (replyToMessageId) {
return sendMessageFeishu({ cfg, to, text, mentions, replyToMessageId, replyInThread, accountId });
}
```
This lets `sendMessageFeishu` apply `buildMentionedMessage` (text format) on its own.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "fix(feishu): use post format for replies..." | Re-trigger Greptile
| result = result.replace( | ||
| /`[^`]*`/g, // skip inline code | ||
| (match) => "\0CODE" + match + "CODE\0", | ||
| ); | ||
| result = result.replace(/<(?!\/?(?:a|b|i|em|strong|br|p|div|span|img|at|code|pre)\b)[^>\n]{0,80}(?!>)$/gm, (m) => { | ||
| return m.replace(/</g, "\\<"); | ||
| }); | ||
| result = result.replace(/\0CODE/g, "").replace(/CODE\0/g, ""); |
There was a problem hiding this comment.
Fenced code blocks unprotected during angle-bracket escaping
Step 3 replaces inline code spans with \0CODE...CODE\0 markers before applying the angle-bracket regex, but it never protects the contents of triple-backtick fenced blocks. Any < inside a fenced block (e.g. a Python comparison if x < y:) will be matched by the regex and escaped to \<. In standard markdown, backslash escaping is disabled inside code blocks, so the rendered output will show the literal backslash — if x \< y: — corrupting the code display.
Minimal reproducer for sanitizeCardKitMarkdown:
```python
if x < 10:
print(x)
→ the `<` on the comparison line becomes `\<`.
The same `split(/(^```[\s\S]*?^```)/m)` pattern used in step 2 could protect fenced segments in step 3 (only apply the angle-bracket replace to non-fenced parts), or the inline-code placeholder pass could be extended to also wrap fenced blocks.
<details><summary>Prompt To Fix With AI</summary>
`````markdown
This is a comment left during a code review.
Path: extensions/feishu/src/streaming-card.ts
Line: 70-77
Comment:
**Fenced code blocks unprotected during angle-bracket escaping**
Step 3 replaces inline code spans with `\0CODE...CODE\0` markers before applying the angle-bracket regex, but it never protects the contents of triple-backtick fenced blocks. Any `<` inside a fenced block (e.g. a Python comparison `if x < y:`) will be matched by the regex and escaped to `\<`. In standard markdown, backslash escaping is disabled inside code blocks, so the rendered output will show the literal backslash — `if x \< y:` — corrupting the code display.
Minimal reproducer for `sanitizeCardKitMarkdown`:
if x < 10:
print(x)→ the `<` on the comparison line becomes `\<`.
The same `split(/(^```[\s\S]*?^```)/m)` pattern used in step 2 could protect fenced segments in step 3 (only apply the angle-bracket replace to non-fenced parts), or the inline-code placeholder pass could be extended to also wrap fenced blocks.
How can I resolve this? If you propose a fix, please make it concise.
| if (replyToMessageId) { | ||
| return sendMessageFeishu({ cfg, to, text: cardText, replyToMessageId, replyInThread, accountId }); | ||
| } |
There was a problem hiding this comment.
Mention format mismatch in reply fallback path
When mentions is non-empty and replyToMessageId is set, the code first formats mentions with buildMentionedCardContent, which produces card-format syntax (<at id=ou_xxx></at> — unquoted id attribute, no display text). It then passes that pre-formatted string to sendMessageFeishu, which wraps it in a post-format md element. Post md expects text-format mentions: <at user_id="ou_xxx">name</at>.
The two formats are incompatible (different attribute names, quoting, and display text), so mentions will silently fail to render in the reply fallback path.
One fix: pass the original text and the mentions array to sendMessageFeishu instead of the pre-formatted cardText:
if (replyToMessageId) {
return sendMessageFeishu({ cfg, to, text, mentions, replyToMessageId, replyInThread, accountId });
}This lets sendMessageFeishu apply buildMentionedMessage (text format) on its own.
Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/feishu/src/send.ts
Line: 700-702
Comment:
**Mention format mismatch in reply fallback path**
When `mentions` is non-empty **and** `replyToMessageId` is set, the code first formats mentions with `buildMentionedCardContent`, which produces card-format syntax (`<at id=ou_xxx></at>` — unquoted `id` attribute, no display text). It then passes that pre-formatted string to `sendMessageFeishu`, which wraps it in a post-format `md` element. Post `md` expects text-format mentions: `<at user_id="ou_xxx">name</at>`.
The two formats are incompatible (different attribute names, quoting, and display text), so mentions will silently fail to render in the reply fallback path.
One fix: pass the original `text` and the `mentions` array to `sendMessageFeishu` instead of the pre-formatted `cardText`:
```typescript
if (replyToMessageId) {
return sendMessageFeishu({ cfg, to, text, mentions, replyToMessageId, replyInThread, accountId });
}
```
This lets `sendMessageFeishu` apply `buildMentionedMessage` (text format) on its own.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ea2faae29e
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| result = result.replace( | ||
| /`[^`]*`/g, // skip inline code | ||
| (match) => "\0CODE" + match + "CODE\0", | ||
| ); |
There was a problem hiding this comment.
Mask inline code before escaping angle brackets
The inline-code protection in sanitizeCardKitMarkdown is ineffective because it wraps each span as "\0CODE" + match + "CODE\0" but leaves the original < characters inside match; the subsequent angle-bracket regex still sees and escapes them. For example, `x < y` is transformed into `x \< y`, which corrupts rendered code in normal streaming responses. This is a user-visible content regression for any inline code containing comparison operators.
Useful? React with 👍 / 👎.
ea2faae to
98ab012
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 98ab012e1c
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| result = result.replace( | ||
| /`[^`]*`/g, // skip inline code | ||
| (match) => "\0CODE" + match + "CODE\0", | ||
| ); |
There was a problem hiding this comment.
Preserve inline code spans before escaping angle brackets
The inline-code masking step does not actually protect < inside code spans, because it wraps the original match ("\0CODE" + match + "CODE\0") without removing the angle bracket characters before the next regex runs. As a result, content like `x < y` is rewritten to `x \< y`, which is a user-visible regression in normal streamed responses that include inline comparisons.
Useful? React with 👍 / 👎.
| // Valid patterns we preserve: <br>, <b>, <i>, <a href=...>, <at ...>, etc. | ||
| // Everything else (e.g. `< foo`, `<3`, `x < y`) gets escaped. | ||
| result = result.replace( | ||
| /`[^`]*`/g, // skip inline code |
There was a problem hiding this comment.
Exclude fenced code blocks from angle-bracket escaping
The sanitizer only attempts to shield inline code via /`[^`]*`/g, so fenced blocks are still processed by the angle-bracket escape regex. In practice, a fenced snippet like if (x < 10) is transformed to include \<, corrupting generated code output for streaming responses that contain < inside triple-backtick blocks.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e0b33c2ab9
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| return `\x00CODESPAN${codeSpans.length - 1}\x00`; | ||
| }); | ||
| result = result.replace( | ||
| /<(?!\/?(?:a|b|i|em|strong|br|p|div|span|img|at|code|pre)\b)[^>\n]{0,80}(?!>)$/gm, |
There was a problem hiding this comment.
Remove length cap when escaping bare angle brackets
The sanitizer only escapes unmatched < when the remainder of the line is at most 80 characters because of [^>\n]{0,80}. Any longer single-line chunk (for example, x < ... with a long tail and no closing >) bypasses sanitization entirely, so Card Kit can still interpret it as an unclosed tag and truncate streaming output. Since this function is the commit’s primary mitigation for truncation, the hard 80-character cap makes the fix unreliable for common long LLM lines.
Useful? React with 👍 / 👎.
The previous regex `[^>\n]{0,80}` silently skipped lines longer than
80 characters, letting bare `<` pass through unsanitized. LLM output
routinely exceeds that length, so the streaming truncation fix was
unreliable for long lines. Replace with `[^>\n]*` to match any length.
Reported by Codex review on PR openclaw#54899.
|
Thanks for the review. Addressed in d3003ac: Issue 1 (fenced block angle-bracket escape): Fixed. Step 3 now extracts fenced Issue 2 (mention format mismatch in reply fallback): I believe this is a false positive — please double-check. At |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d3003ac994
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const inlineTicks = segment.match(/`/g); | ||
| if (inlineTicks && inlineTicks.length % 2 !== 0) { | ||
| // Append a closing backtick to balance the last unmatched one. | ||
| fencedParts[i] = segment + "`"; |
There was a problem hiding this comment.
Ignore escaped backticks when balancing inline spans
Counting inline backticks with segment.match(/`/g) treats escaped literals (for example \`` in Markdown docs) as unmatched code delimiters, so this branch appends an extra trailing backtick to otherwise valid text. In streaming cards, responses that explain Markdown escaping will be mutated (e.g., To type `becomesTo type `followed by an added `` ``), which is user-visible content corruption introduced by the sanitizer.
Useful? React with 👍 / 👎.
|
Addressed the Codex P2 in bb34fa2: step 2 now strips |
Rebased onto current main. Adds sanitizeCardKitMarkdown() to balance unclosed backtick fences/spans and escape bare angle brackets that cause Card Kit streaming to truncate output. Also strips leaked reasoning tags from streaming text and falls back to post format when replying to avoid older clients showing upgrade prompt.
bb34fa2 to
a59391a
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a59391ae9a
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // Sanitize markdown to prevent Card Kit streaming truncation (#26708) | ||
| text = sanitizeCardKitMarkdown(text); | ||
| const mergedInput = mergeStreamingText(this.pendingText ?? this.state.currentText, text); |
There was a problem hiding this comment.
Sanitize final close text before merging snapshots
update() now stores sanitized content in state.currentText, but the final close path still merges raw finalText from reply-dispatcher; when sanitization inserted escapes (for example x < 10), mergeStreamingText no longer sees overlap and appends the raw payload, producing duplicated text and reintroducing unsanitized < on the last Card Kit update. This makes the truncation fix unreliable in the exact cases it targets.
Useful? React with 👍 / 👎.
| const fencePattern = /^```/gm; | ||
| const fenceMatches = result.match(fencePattern); | ||
| if (fenceMatches && fenceMatches.length % 2 !== 0) { | ||
| result += "\n```"; |
There was a problem hiding this comment.
Recognize indented fenced code blocks when balancing
Fence balancing only counts lines that start at column 0 with ``````, so valid indented fences (common inside lists/quotes) are treated as normal text. If such a fence is unclosed, this logic does not append a closing fence and the inline-backtick pass adds only one backtick, leaving markdown unbalanced and still prone to Card Kit truncation for those replies.
Useful? React with 👍 / 👎.
| return `\x00CODEBLOCK${codeBlocks.length - 1}\x00`; | ||
| }); | ||
| const codeSpans: string[] = []; | ||
| result = result.replace(/`[^`\n]*`/g, (match) => { |
There was a problem hiding this comment.
Preserve multi-backtick inline code spans during escaping
Code-span masking uses /`[^`\n]*`/g, which only handles single-backtick delimiters. Markdown spans with multi-backtick delimiters (for example x < y) remain unmasked, so the angle-bracket sanitizer escapes < inside the code sample and visibly corrupts output for that valid markdown form.
Useful? React with 👍 / 👎.
The oxlint no-control-characters rule flags \x00 as unexpected control characters. Switch to Unicode Private Use Area codepoints \uE001/\uE002 which serve the same sentinel purpose without triggering the linter.
|
Rebased onto current main and fixed linter issues (replaced null byte placeholders with Unicode PUA codepoints, converted TODO to NOTE). All code-related checks are green now — remaining failures (security-fast, auto-reply, core) appear to be upstream CI infra issues unrelated to this PR. Ready for review! 🙏 |
Rebased onto current main. Adds clearText() to FeishuStreamingSession so the error handler can reset buffered text without closing the card. This allows upstream retry/failover logic to continue updating the same streaming card instead of producing a stale partial + a second final. Also includes sanitizeCardKitMarkdown from openclaw#54899 and stripReasoningTagsFromText from openclaw#54898 since this branch touches the same files.
|
Codex review: needs changes before merge. Summary Reproducibility: yes. for the PR blockers: a read-only sanitizer probe reproduces duplicated final text for Next step before merge Security Review findings
Review detailsBest possible solution: Keep the Feishu plugin fix, but make streaming state and Card Kit output use one canonical sanitized representation, cover unsafe tag-shaped text and valid multi-backtick spans, and add the required user-facing changelog entry. Do we have a high-confidence way to reproduce the issue? Yes for the PR blockers: a read-only sanitizer probe reproduces duplicated final text for Is this the best way to solve the issue? No. The PR direction is plausible, but sanitizing only the update path while close merges raw final text is not the narrowest maintainable fix; sanitizing consistently at the Feishu Card Kit output/state boundary is safer. Full review comments:
Overall correctness: patch is incorrect Acceptance criteria:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 14b5f73e2a5e. |
Summary
Closes #26708
When an LLM response streamed via Feishu Card Kit contains unmatched backticks or bare angle brackets, the message gets truncated. This is because Card Kit's incremental markdown parser treats unmatched opening backticks as starting a code span/block that never closes, and bare
<as unclosed HTML tags.Changes
sanitizeCardKitMarkdownfunction instreaming-card.ts:updateCardContentcallssanitizeCardKitMarkdownbefore sending to Card KittruncateSummarystrips backticks and angle brackets from notification summaryTest plan
if x < 10) — verify no truncation