fix(feishu): recover UTF-8 filenames from Latin-1 encoded Content-Disposition by alex-xuweilong · Pull Request #48578 · openclaw/openclaw

alex-xuweilong · 2026-03-17T00:20:53Z

Summary

Fixes garbled Chinese/CJK filenames when receiving files via the Feishu channel.

Root Cause

When the Feishu API returns a Content-Disposition header with a plain filename="..." parameter (without the RFC 5987 filename*=UTF-8'' form), the HTTP client decodes the raw UTF-8 bytes as Latin-1. Each 3-byte CJK character becomes 3 Latin-1 characters, producing mojibake.
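To make the root cause concrete, here is a minimal reproduction sketch. It assumes Node.js and simulates the mis-decode with Buffer; the variable names are illustrative and not taken from the PR.

```typescript
import { Buffer } from "node:buffer";

// Filename as the sender intended it (example taken from the commit message).
const original = "何不同舟渡_2.txt";

// The header travels as raw UTF-8 bytes.
const utf8Bytes = Buffer.from(original, "utf8");

// Simulated client behaviour: every byte is read as one Latin-1 character.
const garbled = utf8Bytes.toString("latin1");

// 5 CJK characters * 3 bytes + 6 ASCII bytes = 21 bytes, hence 21 garbled characters.
console.log(original.length, utf8Bytes.length, garbled.length);

// The damage is reversible: Latin-1 maps each byte to exactly one code point.
const roundTrip = Buffer.from(garbled, "latin1").toString("utf8");
console.log(roundTrip === original);
```

Because Latin-1 decoding is lossless at the byte level, the original bytes can be reconstructed from the garbled string, which is what makes a recovery heuristic possible at all.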

Fix

Add a tryRecoverLatin1AsUtf8() helper that detects high-byte Latin-1 artifacts and re-decodes them as UTF-8 via Buffer. Applied to the plain filename extraction path. Falls back to the original string if recovery fails.
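The diff itself is not reproduced on this page, so the following is only a sketch of what such a helper might look like, based on the description above. The function name comes from the PR; the regex pre-checks and the exact structure are assumptions.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical reconstruction of the helper described in the PR, not the merged code.
function tryRecoverLatin1AsUtf8(value: string): string {
  // Assumption: only attempt recovery when the string has high-byte Latin-1
  // characters (U+0080..U+00FF) and nothing above U+00FF. A string that
  // already contains real CJK characters cannot come from a Latin-1 decode.
  const hasHighLatin1 = /[\u0080-\u00ff]/.test(value);
  const hasNonLatin1 = /[^\u0000-\u00ff]/.test(value);
  if (!hasHighLatin1 || hasNonLatin1) {
    return value;
  }
  try {
    // Turn each character back into its original byte, then decode as UTF-8.
    const recovered = Buffer.from(value, "latin1").toString("utf8");
    // Invalid UTF-8 sequences decode to U+FFFD; treat that as "not mojibake".
    return recovered.includes("\uFFFD") ? value : recovered;
  } catch {
    // Fall back to the original string on any unexpected error.
    return value;
  }
}

const mojibake = Buffer.from("何不同舟渡_2.txt", "utf8").toString("latin1");
console.log(tryRecoverLatin1AsUtf8(mojibake));
console.log(tryRecoverLatin1AsUtf8("report.txt"));
console.log(tryRecoverLatin1AsUtf8("caf\u00e9.txt"));
```

In this sketch a genuine Latin-1 name such as "café.txt" is left alone, because the byte 0xE9 followed by "." is not a valid UTF-8 sequence and the decode yields a replacement character.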

Changes

extensions/feishu/src/media.ts: 16 lines added, 1 removed

Fixes #48388

…position

When the Feishu API returns Content-Disposition with filename="..." (without the RFC 5987 filename*=UTF-8'' form), the HTTP client may decode UTF-8 bytes as Latin-1, corrupting Chinese/CJK filenames (e.g. "何不同舟渡_2.txt" becomes "æµ_è_æ_ä_2.txt").

Add a tryRecoverLatin1AsUtf8 helper that detects high-byte Latin-1 artifacts and re-decodes them as UTF-8, applied to the plain filename="..." extraction path.

Fixes openclaw#48388

greptile-apps · 2026-03-17T00:22:38Z

Greptile Summary

This PR adds a small, targeted heuristic (tryRecoverLatin1AsUtf8) to fix garbled CJK filenames when the Feishu API returns a plain filename="…" Content-Disposition header instead of the RFC 5987 filename*=UTF-8''… form. The HTTP client decodes the raw UTF-8 bytes as Latin-1, and the helper re-encodes those Latin-1 chars back to bytes and re-decodes as UTF-8, falling back to the original string if UTF-8 decoding produces replacement characters.

The \uFFFD guard is the correct way to detect a failed UTF-8 decode in Node.js and prevents false positives for most real Latin-1 content.
The empty catch {} is appropriate since the function always falls back to the original string on any error.
The change is narrowly scoped to the plainMatch path; the existing filename*=UTF-8'' (RFC 5987) path is unaffected.
One inherent limitation: a Latin-1 filename whose byte sequence happens to be valid UTF-8 (e.g. two Western-European characters forming a 2-byte UTF-8 sequence) would be silently rewritten. This is an unavoidable trade-off with any header-recovery heuristic and is a very unlikely scenario in Feishu's environment.
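The limitation in the last point can be illustrated with a small sketch. It assumes the same Buffer round trip described in the summary; the sample filename is made up for illustration.

```typescript
import { Buffer } from "node:buffer";

// A filename that is genuinely Latin-1: "Ã" (U+00C3) followed by "©" (U+00A9).
const genuineLatin1 = "\u00C3\u00A9.txt";

// Its bytes are C3 A9, which also happen to be the valid UTF-8 encoding of "é".
const redecoded = Buffer.from(genuineLatin1, "latin1").toString("utf8");

// No replacement character appears, so a \uFFFD guard cannot catch this case.
const guardTriggered = redecoded.includes("\uFFFD");

console.log(redecoded === "\u00E9.txt", guardTriggered);
```

Here the two-character sequence is silently collapsed into a single "é", which is the false positive the review accepts as unlikely for Feishu filenames.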

Confidence Score: 4/5

Safe to merge; the fix is minimal, well-guarded, and falls back gracefully.
The change is a single, self-contained helper with a correct fallback mechanism. The \uFFFD guard and try/catch block protect against incorrect recovery. The only deduction is the inherent heuristic ambiguity (valid Latin-1 that forms valid UTF-8), which is an accepted trade-off and very unlikely in this context.
No files require special attention.

Last reviewed commit: cd566af

vincentkoc · 2026-04-26T22:17:33Z

ProjectClownfish could not safely update this branch, so it opened a narrow replacement PR instead.

Replacement PR: #72388
Source PR: #48578
Contributor credit is preserved in the replacement PR body and changelog plan.

openclaw-barnacle Bot added the "channel: feishu" and "size: XS" labels Mar 17, 2026

This was referenced Mar 17, 2026

[Bug]: Feishu file names with Chinese characters are garbled (UTF-8 encoding issue) #48388

Closed

feat: centralized filename encoding utility for multi-encoding Content-Disposition handling #48788

Open

vincentkoc mentioned this pull request Apr 26, 2026

fix(feishu): recover mojibake filenames from Content-Disposition #72388

Merged

vincentkoc closed this Apr 26, 2026

alex-xuweilong wants to merge 1 commit into openclaw:main from alex-xuweilong:fix/feishu-utf8-filename-decoding

2 participants
