fix(feishu): improve interactive card text extraction for Lark cards#32024
fix(feishu): improve interactive card text extraction for Lark cards#32024Elarwei001 wants to merge 16 commits intoopenclaw:mainfrom
Conversation
🔒 Aisle Security AnalysisWe found 2 potential security issue(s) in this PR:
1. 🔵 DoS via unbounded sanitization of interactive card text before output truncation
DescriptionThe new interactive-card text extraction introduces output caps (
Vulnerable code: const clean = sanitizeCardText(s);
const clipped = clean.slice(0, CARD_MAX_OUTPUT_CHARS - outChars);RecommendationClip before running regex-based sanitization, and/or enforce a hard maximum input size per field and per message. Example fix (clip then sanitize, keeping a small buffer to avoid cutting escape sequences mid-way): const MAX_FIELD_SCAN = 16_384; // or another defensible bound
const pushText = (s: string) => {
if (!s || outChars >= CARD_MAX_OUTPUT_CHARS) return;
// Bound work even if attacker provides multi-megabyte strings
const remaining = CARD_MAX_OUTPUT_CHARS - outChars;
const scanLen = Math.min(s.length, Math.max(remaining + 256, 0), MAX_FIELD_SCAN);
const clean = sanitizeCardText(s.slice(0, scanLen));
const clipped = clean.slice(0, remaining);
if (!clipped) return;
out.push(clipped);
outChars += clipped.length;
};Also consider applying a maximum length guard on 2. 🔵 CRLF/log injection via incomplete control-character sanitization in Feishu interactive card text
DescriptionThe new Feishu interactive-card text sanitization is intended to prevent terminal/log injection, but it fails to remove carriage return (
Vulnerable code: const CONTROL_CHAR_RE = /[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]/g;
function sanitizeCardText(s: string): string {
return s.replace(ANSI_ESCAPE_RE, "").replace(CONTROL_CHAR_RE, "");
}RecommendationInclude Fix // keep \n and \t if desired, but strip \r as well
const CONTROL_CHAR_RE = /[\x00-\x08\x0b\x0c\x0d\x0e-\x1f\x7f]/g;Additionally, harden log output (recommended) so multi-line values cannot forge log entries: const safeForLog = (s: string) => s.replace(/[\r\n]/g, "\\n");
log(`... fetched quoted message: ${safeForLog(quotedContent ?? "").slice(0, 100)}`);This preserves card text for LLM context while preventing CR/LF log forging in operational logs. Analyzed PR: #32024 at commit Last updated on: 2026-03-07T07:06:55Z |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fddfc8790f
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Greptile SummaryImproved Feishu interactive card text extraction to handle alert/notification cards. The PR extracts a new
The implementation is well-structured with defensive type checking. The fallback to JSON serialization is intentional per the inline comment, though it could produce large strings for complex cards. Recommendation: Add test coverage for the new element types ( Confidence Score: 4/5
Last reviewed commit: fddfc87 |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0aba6253eb
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f02c22a0a4
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4aece13866
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 52778458ff
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2c48a846f8
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f99bcd114a
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1ea586e47c
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3c853749fd
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 49a2411718
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
|
||
| // Legacy format: top-level "title" string (not inside a header object) | ||
| if (typeof card.title === "string" && card.title) { | ||
| texts.push(sanitizeCardText(card.title)); |
There was a problem hiding this comment.
Cap top-level card title before appending output
CARD_MAX_OUTPUT_CHARS is only enforced inside extractCardTextElements, but parseInteractiveCardContent appends card.title (and similarly header.subtitle.content) directly to texts without clipping. For interactive cards that carry unusually large title/subtitle strings, getMessageFeishu can still emit arbitrarily large quoted content, which defeats the new DoS/output-size guard and can bloat downstream prompt/log payloads.
Useful? React with 👍 / 👎.
|
Addressed all automated review feedback: Codex (9 threads): all resolved — schema 2.0 Aisle (CWE-400, CWE-117): Greptile: test coverage added for all new element types (11 tests passing); |
|
@aisle review |
|
@Takhoffman Could you please review this when you have a moment? Thanks. Btw, I don't know why aisle can't not work and re-review again, but already fix the logic it mentioned. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4388ffc233
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
|
||
| ```json | ||
| { | ||
| "action": "download_message_file", |
There was a problem hiding this comment.
Remove unsupported
download_message_file action from skill doc
This skill update tells the agent to call {"action":"download_message_file"}, but the feishu_doc dispatcher does not implement that action (the switch (p.action) in extensions/feishu/src/docx.ts only handles read/write/list/upload/table/color actions), so following this guidance now deterministically returns Unknown action. In practice this breaks file-forwarding workflows by steering users/agents toward a non-existent API path.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 90120e24b4
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| (inTag === "text" || inTag === "a" || inTag === "code_block") && | ||
| typeof inEl.text === "string" && | ||
| inEl.text.trim() | ||
| ) { | ||
| pushText(inEl.text.trim()); |
There was a problem hiding this comment.
Concatenate legacy inline row text before pushing output
In the legacy array-of-arrays path, each inline token in a single Feishu row is sent to pushText independently, but parseInteractiveCardContent later joins all collected pieces with newlines, so one rendered paragraph like PSM: my-service becomes PSM:\nmy-service. This changes message structure (and drops intentional inline spacing after .trim()), which can distort quoted alert context and key-value lines; aggregate each row’s inline text first, then append one row string.
Useful? React with 👍 / 👎.
| inTag === "at" && | ||
| typeof inEl.user_name === "string" && | ||
| inEl.user_name.trim() |
There was a problem hiding this comment.
Handle
at mentions that omit user_name
The at branch only extracts mention text when user_name is present, so interactive cards that provide only user_id/open_id will silently lose mention content in quoted replies. That drops important actor/assignee context for alerts and routing messages in environments where display names are unavailable; add a fallback to the ID fields instead of skipping the mention.
Useful? React with 👍 / 👎.
Alert and notification cards commonly use column_set, header, note, plain_text, and lark_md tags rather than div/markdown. The previous implementation only handled div and markdown, causing all alert cards to fall back to the useless '[Interactive Card]' placeholder. Changes: - Extract helper extractCardTextElements() with support for: div, markdown, plain_text, lark_md, header (title field), note (recursive), and column_set (recurse into columns[].elements) - Also extract the top-level card header title separately - When no text can be extracted, fall back to JSON.stringify(parsed) so the agent receives the full card structure instead of a placeholder
Three security issues reported by Aisle: 1. (High) JSON.stringify fallback leaked full card JSON (action payloads, URLs, IDs) into logs and LLM prompt context. Reverted to safe '[Interactive Card]' placeholder when no text is extractable. 2. (Medium) Unbounded JSON.stringify output on large cards. Removed the JSON.stringify fallback entirely. 3. (Medium) Unbounded recursive traversal allowed stack exhaustion and CPU/memory blowup on deeply-nested cards. Replaced recursion with iterative BFS bounded by CARD_MAX_NODES=500 and CARD_MAX_OUTPUT_CHARS=8000.
Feishu schema 2.0 cards (including those built by buildMarkdownCard) store content under card.body.elements rather than card.elements. The previous implementation only checked card.elements, causing all schema 2.0 alert cards to fall through to the '[Interactive Card]' placeholder. Now checks card.elements first (schema 1.0), then falls back to card.body.elements (schema 2.0).
Cover: plain_text, lark_md, header title, note (recursive), column_set, schema 2.0 (body.elements), and [Interactive Card] fallback for unrecognized-only elements.
…ractor Two DoS issues flagged by security review: 1. column_set with thousands of columns caused unbounded queue growth even though CARD_MAX_NODES limited processing — all columns were still iterated and enqueued before the node limit took effect. Fix: add CARD_MAX_QUEUED_ARRAYS=64 cap; enqueue() silently drops new arrays once the limit is reached; column_set loop breaks early on fanout. 2. queue.shift() is O(n) per call, so large queues had O(n²) total CPU. Fix: replace shift() with an index pointer (qi) for O(1) dequeue.
…E-117) Extracted text from interactive card elements is attacker-influenced and can contain ANSI escape sequences (CSI/OSC) or non-printable control characters that forge log lines or manipulate terminal output. Add sanitizeCardText() that strips ANSI CSI/OSC sequences and control characters (0x00-0x08, 0x0b-0x0c, 0x0e-0x1f, 0x7f), preserving \n and \t. Applied in pushText() before any text reaches the output buffer.
Agents can now download file attachments from Feishu messages (e.g.
forwarded .md/.pdf/.docx files) by providing the message_id and
file_key. The file is saved to disk and the local path is returned,
ready to pass into write/append actions.
Uses the existing downloadMessageResourceFeishu() helper which calls
/im/v1/messages/{message_id}/resources/{file_key}.
… card extractor Three issues from codex review: 1. (P1/P2) column_set loop iterated all el.columns without consuming the node budget, so a wide array of empty/malformed column objects still caused O(n) scan cost despite CARD_MAX_NODES. Fix: each column entry now increments seenNodes regardless of whether it is enqueueable. 2. (P2) FIFO queue caused nested content (note/column_set) to be emitted after later siblings, inverting document order. Fix: switch to LIFO stack (push/pop). Push column_set columns in reverse order so LIFO pop restores original left-to-right sequence.
…iagnosis Temporary debug log to capture the raw body.content when a quoted interactive card message is fetched. Helps diagnose why bot-sent cards show [Interactive Card] while forwarded cards extract correctly. Remove before final merge.
…ctor
Debug logs revealed two distinct card body formats:
1. Externally-forwarded cards: legacy format with elements as array-of-arrays
[{tag:'text',text:'...'},{tag:'a',...}] — not handled before, fell through
to [Interactive Card]. Also has a top-level 'title' string field.
2. Bot-sent cards quoted back: Feishu converts to a minimal render snapshot
(img tags + empty text) — content genuinely lost by Feishu, not fixable.
Fix for case 1:
- Detect Array.isArray(element) in the DFS loop and extract inline text tags
- Extract top-level card.title string in parseInteractiveCardContent
- Hoist pushText/enqueue above the while loop (were defined after first use)
- Add test for legacy array-of-arrays format
…tractor The legacy format uses 'content' (not 'elements') as the array-of-arrays field, per the official Feishu post message spec. Also support more inline tags: a (link text), at (user_name), code_block (text); skip img/media/ emotion/hr. Add 'content' as third fallback after elements and body.elements in parseInteractiveCardContent.
header.subtitle ({tag: plain_text|lark_md, content: ...}) was missing
from parseInteractiveCardContent — only title was extracted.
…udget; remove download_message_file - sanitize card.title and header.subtitle.content via sanitizeCardText (CWE-117) - consume seenNodes budget inside legacy inline row loop to prevent unbounded iteration - remove download_message_file action from feishu_doc tool (out of scope)
90120e2 to
81921e4
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 81921e4ff4
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // (except \n and \t) from card text before it reaches logs or LLM context. | ||
| // Prevents log/terminal injection via attacker-influenced card payloads (CWE-117). | ||
| const ANSI_ESCAPE_RE = /\x1b(?:\[[0-?]*[ -/]*[@-~]|\][^\x07]*\x07)/g; | ||
| const CONTROL_CHAR_RE = /[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]/g; |
There was a problem hiding this comment.
Strip carriage returns from sanitized card text
sanitizeCardText is intended to remove control characters except \n/\t, but CONTROL_CHAR_RE omits \x0d (carriage return). Interactive-card fields containing \r therefore pass through into quoted message content, which can still enable log/terminal line rewriting even after the CWE-117 hardening in this commit. This is most visible when attacker-influenced card text is surfaced in logs or TTY output.
Useful? React with 👍 / 👎.
|
This pull request has been automatically marked as stale due to inactivity. |
|
Closing due to inactivity. |
Fixes #32023.
When a bot is @mentioned in a Feishu group with a quoted interactive card, the previous implementation returned
[Interactive Card]for most card formats because it only handleddivandmarkdownelement tags.Background
Feishu has two distinct interactive card formats:
im.message.getreturns the complete JSON — readable by bots.im.message.getreturns an initial-state snapshot — not fixable at the API level (see feat(feishu): bot cannot read Cardkit-based interactive card content via im.message.get #32930).This PR addresses the first category. For card JSON structure reference, see Card JSON 2.0 Structure and Card JSON 1.0 Structure.
Changes
Fix
extractCardTextElements()helper with support for:div,markdown,plain_text,lark_md,header(title + subtitle),note(recursive),column_set(recurse intocolumns[].elements)card.body.elements) in addition to schema 1.0 (card.elements)card.contentas array-of-arrays); inline tags:text,a(link),at(mention → user_name),code_block; skipimg/media/emotion/hrcard.titlestring (present in some legacy formats)CARD_MAX_NODES=500,CARD_MAX_OUTPUT_CHARS=8000,CARD_MAX_QUEUED_ARRAYS=64to prevent DoS; index-based dequeue avoids O(n²)sanitizeCardText()to strip ANSI escape sequences and control characters (CWE-117)[Interactive Card]placeholder when no text is extractableTest
plain_text,lark_md,header,note,column_set, schema 2.0, legacy array-of-arrays format, and placeholder fallback