fix(feishu): improve interactive card text extraction for all card formats #45936

Br1an67 wants to merge 1 commit into openclaw:main
Conversation
Greptile Summary

This PR extends Feishu interactive card text extraction. Key findings:

Confidence Score: 3/5
Prompt To Fix All With AI

This is a comment left during a code review.
Path: extensions/feishu/src/send.ts
Line: 301-316
Comment:
**DFS stack growth is unbounded despite the DoS protection claim**
`CARD_MAX_NODES` only limits how many nodes are *popped and processed*; it does not limit how many children are *pushed* in a single iteration. A single `column_set` or `note` element can push an unbounded number of children onto `stack` before the `nodesVisited` guard ever runs again.
Concrete attack: a crafted card with `column_set.columns` containing 10,000 entries each holding 10,000 elements pushes 100 M items to `stack` in one loop iteration, causing out-of-memory before `CARD_MAX_NODES = 500` can protect anything. The card JSON itself would need to be large, but Feishu's per-message size limit (~128 KB) still allows pushing thousands of items per expansion.
Consider adding a hard cap on the number of children pushed per expansion, e.g.:
```ts
const MAX_PUSH_PER_EXPANSION = 100; // or tie to CARD_MAX_NODES
if (tag === "note" && Array.isArray(item.elements)) {
  const children = (item.elements as unknown[]).slice(0, MAX_PUSH_PER_EXPANSION);
  for (let i = children.length - 1; i >= 0; i--) {
    stack.push(children[i]);
  }
  continue;
}
if (tag === "column_set" && Array.isArray(item.columns)) {
  const cols = (item.columns as unknown[]).slice(0, MAX_PUSH_PER_EXPANSION);
  for (let ci = cols.length - 1; ci >= 0; ci--) {
    const col = cols[ci] as Record<string, unknown> | undefined;
    if (col && Array.isArray(col.elements)) {
      const els = (col.elements as unknown[]).slice(0, MAX_PUSH_PER_EXPANSION);
      for (let ei = els.length - 1; ei >= 0; ei--) {
        stack.push(els[ei]);
      }
    }
  }
  continue;
}
```
Or add a `stack.length` guard at the top of the while loop (e.g. `stack.length > CARD_MAX_NODES * 4`) to abort before memory is exhausted.
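As a minimal, hedged sketch of that `stack.length` guard: the traversal shape, `countNodes` helper, and `CARD_MAX_STACK` constant below are invented for illustration; only `CARD_MAX_NODES` comes from the PR.

```ts
// Illustrative DFS with a stack-size abort. CARD_MAX_STACK and countNodes are
// sketch-only names — only CARD_MAX_NODES exists in the PR under review.
const CARD_MAX_NODES = 500;
const CARD_MAX_STACK = CARD_MAX_NODES * 4;

function countNodes(root: unknown): number {
  const stack: unknown[] = [root];
  let nodesVisited = 0;
  while (stack.length > 0 && nodesVisited < CARD_MAX_NODES) {
    // Abort before a single large expansion can exhaust memory.
    if (stack.length > CARD_MAX_STACK) break;
    const item = stack.pop() as Record<string, unknown> | undefined;
    nodesVisited++;
    if (item && Array.isArray(item.elements)) {
      const els = item.elements as unknown[];
      for (let i = els.length - 1; i >= 0; i--) stack.push(els[i]);
    }
  }
  return nodesVisited;
}
```

Note the guard only trips on the iteration after an oversized push, so memory use for that one expansion is still bounded only by the card's JSON size; a per-expansion cap is the stricter option.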
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: extensions/feishu/src/send.ts
Line: 149-151
Comment:
**Misleading constant name**
`CARD_MAX_QUEUED_ARRAYS` is used as a limit on the number of *paragraphs* iterated in the legacy rich-text path — not on anything actually "queued". The name implies it bounds a queue/stack, which can mislead a future reader into thinking the legacy path is already stack-guarded. Consider renaming to `CARD_MAX_PARAGRAPHS` or `CARD_LEGACY_MAX_PARAGRAPHS` to match what the variable actually controls.
```suggestion
const CARD_MAX_NODES = 500;
const CARD_MAX_OUTPUT_CHARS = 8000;
const CARD_MAX_PARAGRAPHS = 64;
```
How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 7866171
```ts
if (tag === "note" && Array.isArray(item.elements)) {
  for (let i = (item.elements as unknown[]).length - 1; i >= 0; i--) {
    stack.push((item.elements as unknown[])[i]);
  }
  continue;
}
if (tag === "column_set" && Array.isArray(item.columns)) {
  for (let ci = (item.columns as unknown[]).length - 1; ci >= 0; ci--) {
    const col = (item.columns as unknown[])[ci] as Record<string, unknown> | undefined;
    if (col && Array.isArray(col.elements)) {
      for (let ei = (col.elements as unknown[]).length - 1; ei >= 0; ei--) {
        stack.push((col.elements as unknown[])[ei]);
      }
    }
  }
  continue;
}
```
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 786617156e
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
```ts
for (let i = elementSources.length - 1; i >= 0; i--) {
  stack.push(elementSources[i]);
```
Apply node limits before queuing all card elements
The CARD_MAX_NODES guard is only enforced in the DFS while condition, but this loop pre-queues every entry from elementSources first, so a card with an extremely large body.elements array will still force O(n) pushes and memory growth before the 500-node limit can help. Because schema 2.0 cards are newly traversed in this change, this introduces a new path where attacker-controlled interactive-card payloads can still trigger high CPU/memory usage despite the intended bound.
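One possible shape for that fix — a hedged sketch where `seedStack` is an invented helper name and only `CARD_MAX_NODES` comes from the PR — is to apply the node budget while seeding the stack rather than only while draining it:

```ts
const CARD_MAX_NODES = 500;

// Seed the DFS stack with at most CARD_MAX_NODES entries: anything beyond the
// budget could never be popped and processed anyway, so queuing it only
// inflates memory. Later entries are pushed first so index 0 ends up on top.
function seedStack(elementSources: unknown[]): unknown[] {
  const stack: unknown[] = [];
  const limit = Math.min(elementSources.length, CARD_MAX_NODES);
  for (let i = limit - 1; i >= 0; i--) {
    stack.push(elementSources[i]);
  }
  return stack;
}
```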
```ts
if (totalChars >= CARD_MAX_OUTPUT_CHARS) break;
texts.push(fragment);
totalChars += fragment.length;
```
Clamp fragment length before appending card text
The output cap is checked before appending, but each fragment is appended in full and only then counted, so one oversized markdown/plain_text field can exceed CARD_MAX_OUTPUT_CHARS by a large margin. In large-card scenarios this returns far more text than intended to downstream processing and defeats the protection implied by the size limit.
💡 Codex Review
Reviewed commit: 52ff58b9c4
```ts
for (const inline of paragraph) {
  if (totalChars >= maxChars) return fragments;
```
Cap legacy inline traversal independently of extracted chars
The loop over legacy card.content inlines only stops when totalChars >= maxChars, so cards with very large paragraphs of unsupported inline nodes (for example many img/unknown tags) are still scanned in full because totalChars never increases. This keeps an attacker-controlled O(n) CPU path in quoted interactive-card parsing even after adding bounds, so extractLegacyContentText needs a node/iteration limit similar to CARD_MAX_NODES.
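A sketch of such an iteration cap (hypothetical names: `LEGACY_MAX_INLINES`, `scanInlines`; only the `totalChars`/`maxChars` pattern mirrors the diff):

```ts
const LEGACY_MAX_INLINES = 500;

// Bound the scan by node count as well as by extracted characters, so
// paragraphs full of text-less nodes (img/unknown tags) cannot force an
// unbounded O(n) walk even though totalChars never increases for them.
function scanInlines(paragraph: unknown[], maxChars: number): string[] {
  const fragments: string[] = [];
  let totalChars = 0;
  let inlinesVisited = 0;
  for (const inline of paragraph) {
    if (++inlinesVisited > LEGACY_MAX_INLINES) break;
    if (totalChars >= maxChars) break;
    const node = inline as Record<string, unknown> | null;
    if (node && typeof node.text === "string") {
      fragments.push(node.text);
      totalChars += node.text.length;
    }
  }
  return fragments;
}
```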
```ts
if (typeof card.title === "string" && card.title.trim()) {
  texts.push(card.title);
  totalChars += card.title.length;
```
Apply output cap before appending top-level card title
card.title is appended without checking CARD_MAX_OUTPUT_CHARS, so a single oversized title can bypass the intended 8k output bound before any other extraction logic runs. In that case parseInteractiveCardContent can still return arbitrarily large text to downstream processing despite the new size guard.
💡 Codex Review
Reviewed commit: 0eac15b637
```ts
for (let ci = (item.columns as unknown[]).length - 1; ci >= 0; ci--) {
  const col = (item.columns as unknown[])[ci] as Record<string, unknown> | undefined;
  if (col && Array.isArray(col.elements)) {
    for (let ei = (col.elements as unknown[]).length - 1; ei >= 0; ei--) {
      stack.push((col.elements as unknown[])[ei]);
```
Enforce push limits while expanding column_set children
In parseInteractiveCardContent, this column_set branch pushes every nested element onto stack without checking remaining node/stack budget, and the stackSizeLimit guard runs only on the next while iteration. A single interactive card with a very large columns[*].elements payload can therefore still force O(n) stack growth and high memory/CPU before the limit is applied, which undermines the DoS protection added in this commit.
```ts
if (text) {
  fragments.push(text);
  totalChars += text.length;
}
```
Clamp legacy inline fragments to the output budget
extractLegacyContentText adds each extracted legacy inline string in full and only checks totalChars >= maxChars before the next node, so one oversized text/a/code_block fragment can exceed CARD_MAX_OUTPUT_CHARS by a large margin. This means interactive legacy cards can still return much more content than intended and negate the new output cap in large-fragment scenarios.
💡 Codex Review
Reviewed commit: bcf3c047b3
```ts
if (!Array.isArray(paragraph)) continue;
if (++queuedArrays > CARD_MAX_PARAGRAPHS) break;
```
Count every legacy content entry toward paragraph budget
In extractLegacyContentText, the paragraph budget is only incremented after Array.isArray(paragraph) passes, so a malformed content array with many non-array entries is scanned end-to-end with no top-level cap. This legacy-card path is new in this commit, so a crafted interactive payload can still force O(n) parsing work despite the added DoS bounds; consume budget before the type guard so all entries are bounded.
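A sketch of consuming the budget before the type guard (`countScanned` is an invented helper; `CARD_MAX_PARAGRAPHS` matches the renamed constant suggested earlier in the thread):

```ts
const CARD_MAX_PARAGRAPHS = 64;

// Charge every top-level entry against the paragraph budget, including
// malformed non-array entries, so a crafted content array cannot be scanned
// end-to-end for free.
function countScanned(content: unknown[]): number {
  let scanned = 0;
  for (const paragraph of content) {
    if (scanned >= CARD_MAX_PARAGRAPHS) break;
    scanned++;
    if (!Array.isArray(paragraph)) continue;
    // ...process the paragraph's inline nodes here...
  }
  return scanned;
}
```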
```ts
} else if (nodeTag === "at" && typeof node.user_name === "string") {
  text = `@${node.user_name}`;
```
Fallback to mention IDs when user_name is absent
Legacy at nodes are only rendered when user_name is present, so mentions that provide only user_id/open_id are dropped from extracted card text. That loses important participant context in quoted interactive cards and can degrade reply accuracy; use ID fields as a fallback when user_name is missing.
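A hedged sketch of that fallback — the field names (`user_name`, `user_id`, `open_id`) follow the comment's description of legacy `at` nodes, and `renderAt` is an invented helper:

```ts
// Prefer the display name, then fall back to whichever ID field is present,
// so ID-only mentions still appear in the extracted card text.
function renderAt(node: Record<string, unknown>): string | undefined {
  const candidates = [node.user_name, node.user_id, node.open_id];
  for (const value of candidates) {
    if (typeof value === "string" && value) return `@${value}`;
  }
  return undefined;
}
```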
💡 Codex Review
Reviewed commit: 29c7f00d22
```ts
const remaining = CARD_MAX_OUTPUT_CHARS - totalChars;
if (remaining <= 0) break;
const clamped = fragment.length > remaining ? fragment.slice(0, remaining) : fragment;
texts.push(clamped);
totalChars += clamped.length;
```
Count newline separators in card output budget
The new CARD_MAX_OUTPUT_CHARS guard only tracks fragment text length, but the final output is built with texts.join("\n"), which adds one extra character between every fragment that is never budgeted. In cards with many small fragments (for example lots of short plain_text or legacy inline nodes), this can exceed the intended 8k cap by hundreds to ~1.5k characters, weakening the DoS bound this change introduces.
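One way to budget the separators — a hedged sketch where `pushFragment` is an invented helper, while `CARD_MAX_OUTPUT_CHARS` and the clamp pattern mirror the diff above:

```ts
const CARD_MAX_OUTPUT_CHARS = 8000;

// Charge one character per fragment boundary so the eventual texts.join("\n")
// can never exceed the cap. Returns the updated running total.
function pushFragment(texts: string[], totalChars: number, fragment: string): number {
  const separator = texts.length > 0 ? 1 : 0; // the "\n" that join() will insert
  const remaining = CARD_MAX_OUTPUT_CHARS - totalChars - separator;
  if (remaining <= 0) return totalChars;
  const clamped = fragment.length > remaining ? fragment.slice(0, remaining) : fragment;
  texts.push(clamped);
  return totalChars + clamped.length + separator;
}
```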
💡 Codex Review
Reviewed commit: e0f29fc994
```ts
function sanitizeCardText(raw: string): string {
  // Strip ANSI escape sequences and control characters (CWE-117)
  // eslint-disable-next-line no-control-regex
  return raw.replace(/\x1B\[[0-9;]*[A-Za-z]/g, "").replace(/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/g, "");
```
Strip carriage returns in card text sanitization
sanitizeCardText removes ESC and most control characters but still allows \r, so the newly parsed interactive-card fields can carry carriage returns into quotedMsg.content; that value is logged directly in extensions/feishu/src/bot.ts (fetched quoted message: ...), which allows crafted quoted cards to forge or overwrite log lines despite the CWE-117 hardening in this commit. Normalize or strip \r (for example by converting CRLF to \n) before returning sanitized card text.
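A sketch of that normalization, extending the sanitizer shown in the diff; the carriage-return handling is the suggested addition, while the ANSI and control-character regexes are copied from the PR:

```ts
// Normalize CRLF/CR to LF before stripping ANSI escapes and the remaining
// control characters, so carriage returns cannot forge log lines downstream.
function sanitizeCardText(raw: string): string {
  // eslint-disable-next-line no-control-regex
  return raw
    .replace(/\r\n?/g, "\n") // CRLF and bare CR both become LF
    .replace(/\x1B\[[0-9;]*[A-Za-z]/g, "")
    .replace(/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/g, "");
}
```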
💡 Codex Review
Reviewed commit: 9e9f251ad4
```ts
if (typeof card.title === "string" && card.title.trim()) {
  const remaining = CARD_MAX_OUTPUT_CHARS - totalChars;
  if (remaining > 0) {
    const clamped = card.title.length > remaining ? card.title.slice(0, remaining) : card.title;
```
This extraction path only checks card.title and then traverses elements/body.elements, but standard Feishu card titles are typically under card.header.title.content, so cards with header-only text (or headers plus non-text body blocks) still lose their primary context and can fall back to [Interactive Card]. This means the new header support is incomplete for real interactive payloads and quoted-message understanding remains degraded in those cases.
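A hedged sketch of also reading the header path; the nested `card.header.title.content` shape follows the comment's description of standard Feishu cards, and `extractHeaderTitle` is an invented helper:

```ts
// Pull the primary title out of card.header.title.content when present,
// returning undefined so callers can fall back to other extraction paths.
function extractHeaderTitle(card: Record<string, unknown>): string | undefined {
  const header = card.header as Record<string, unknown> | undefined;
  const title = header?.title as Record<string, unknown> | undefined;
  const content = title?.content;
  return typeof content === "string" && content.trim() ? content : undefined;
}
```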
…rmats (closes openclaw#32023)

Extend parseInteractiveCardContent() to handle additional Feishu card formats that previously returned "[Interactive Card]" placeholder text:
- Schema 2.0 cards (body.elements) in addition to 1.0 (elements)
- New element tags: plain_text, lark_md, header (title+subtitle), note, column_set
- Legacy rich-text content (array-of-arrays with text, a, at, code_block tags)
- Top-level card.title string (some legacy formats)
- DFS traversal bounded by CARD_MAX_NODES/CARD_MAX_OUTPUT_CHARS to prevent DoS
- Sanitize extracted text to strip ANSI escapes and control chars (CWE-117)

Add tests for all new element types: schema 2.0, header, lark_md, note, column_set, legacy array-of-arrays, and empty card fallback.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
💡 Codex Review
Reviewed commit: c26b9635c6
```ts
  id: string,
  factory: ContextEngineFactory,
): ContextEngineRegistrationResult {
  return registerContextEngineForOwner(id, factory, PUBLIC_CONTEXT_ENGINE_OWNER);
```
Preserve core registration path for default context engine
registerContextEngine() now always registers with owner "public-sdk", so registerLegacyContextEngine() (which still calls this API with id "legacy") is rejected by the new default-slot owner check and silently leaves the default engine unregistered. Any normal agent run that calls ensureContextEnginesInitialized() and then resolveContextEngine() can now fail with Context engine "legacy" is not registered, which is a functional regression in the main runtime path.
Thanks for the context here. I swept through the related work, and this PR is now duplicate or superseded, so I'm closing it as superseded. The focused Feishu quoted/replied interactive-card extraction shipped through the maintainer replacement in #72397 and is documented in the v2026.4.26 changelog with credit to this contribution. The remaining broader parser gaps are now better handled by narrower follow-ups, while this branch carries unrelated generated/config and context-engine churn that should not be merged as-is.

Best possible solution: close this PR as superseded by the shipped #72397 replacement, keep the focused implementation already on main, and review narrower follow-ups such as #73203 or the broader inbound parser issues for any remaining Feishu card formats.

What I checked:

So I'm closing this here and keeping the remaining discussion on the canonical linked item.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 189c91eae6e1; fix evidence: release v2026.4.26.
Summary

Quoted interactive cards were returned as the [Interactive Card] placeholder instead of the actual card content, because parseInteractiveCardContent() only handles div and markdown element tags. Extended parseInteractiveCardContent() in extensions/feishu/src/send.ts to support schema 2.0 cards (body.elements), additional element tags (plain_text, lark_md, header, note, column_set), legacy rich-text content (array-of-arrays with text/a/at/code_block tags), and top-level card.title. Added DFS traversal bounds (CARD_MAX_NODES=500, CARD_MAX_OUTPUT_CHARS=8000) to prevent DoS, and sanitizeCardText() to strip ANSI escapes/control chars (CWE-117).

Change Type
Scope
Linked Issue/PR
Security Impact

Repro + Verification

Steps
- Send an interactive card (header, column_set, plain_text elements) to a group

Expected

Actual (before fix)
- [Interactive Card] placeholder

Evidence
- Unit tests in send.test.ts covering: schema 2.0, header/lark_md/note, column_set, legacy rich-text, and empty card fallback

Human Verification

Review Conversations

Compatibility / Migration
- Existing div and markdown extraction paths unchanged

Failure Recovery
- extensions/feishu/src/send.ts, extensions/feishu/src/send.test.ts

Risks and Mitigations
None — purely additive text extraction logic with bounds to prevent DoS.
This PR was AI-assisted (fully tested with unit tests, pre-existing build tooling issues prevent full pnpm build/check locally).