Bug type
Behavior bug (incorrect output/state without crash)
Beta release blocker
No
Summary
CJK filenames in inbound Feishu messages still produce mojibake when delivered via the JSON file_name field (distinct from the Content-Disposition path fixed in #72388); the media.test.ts suite contains an explicit test that asserts the broken behavior.
Steps to reproduce
- Run any OpenClaw build with the Feishu channel configured (verified on installed 2026.4.5; behavior also present on current main, commit 6861d8a6, by inspection).
- From the Feishu app, send a file whose filename contains CJK characters (e.g. 武汉15座山登山信息汇总.csv or 何不同舟渡_2.txt) to a bot bridged through OpenClaw.
- Inspect the saved file under ~/.openclaw/media/inbound/.
Expected behavior
The saved filename preserves the original CJK characters, matching the behavior already guaranteed by recoverUtf8FileNameFromLatin1Header for the Content-Disposition path after #72388. A clean Latin-1 filename such as café-©.txt continues to be preserved unchanged (the helper rejects when recovery would produce U+FFFD and requires the recovered string to contain East Asian script).
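The guard described above can be sketched roughly as follows. This is an illustration only, not the code in media.ts: the name recoverCjkFileName and the EAST_ASIAN character ranges are hypothetical, and the real recoverUtf8FileNameFromLatin1Header may use different checks.

```typescript
import { Buffer } from "node:buffer";

// Assumption: an approximate "East Asian script" test covering Hiragana,
// Katakana, CJK Unified Ideographs (plus Extension A), and Hangul syllables.
const EAST_ASIAN = /[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af]/;

// Hypothetical stand-in for the existing recover helper.
function recoverCjkFileName(name: string): string {
  // Latin-1 mojibake can only contain code units up to U+00FF.
  if (!/^[\u0000-\u00ff]*$/.test(name)) return name;

  // Reinterpret each character as one byte, then decode those bytes as UTF-8.
  const recovered = Buffer.from(name, "latin1").toString("utf8");

  // Invalid UTF-8 decodes to U+FFFD: the input was genuine Latin-1, keep it.
  if (recovered.includes("\uFFFD")) return name;

  // Only accept the recovery when it actually yields East Asian text.
  return EAST_ASIAN.test(recovered) ? recovered : name;
}
```

With this sketch, café-©.txt is returned unchanged because é (0xE9) followed by a non-continuation byte is not valid UTF-8, while the Latin-1 view of a UTF-8 encoded CJK name decodes cleanly and is restored.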
Actual behavior
The file lands on disk with mojibake (UTF-8 bytes interpreted as Latin-1), e.g. 武汉15座山登山信息汇总.csv → æ¦æ±15座山ç»å±±ä¿¡æ¯æ±æ»¥.csv. The JSON path never reaches the existing recover helper.
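For readers unfamiliar with this failure mode, the following standalone snippet reproduces it outside OpenClaw. It is a sketch that assumes the corruption is a plain UTF-8-as-Latin-1 misread, as described above; the variable names are illustrative.

```typescript
import { Buffer } from "node:buffer";

const original = "何不同舟渡_2.txt";

// Encode as UTF-8, then misread every byte as one Latin-1 character.
const mojibake = Buffer.from(original, "utf8").toString("latin1");

// Each of the five CJK characters occupies three UTF-8 bytes, so the
// misread string is longer than the original.
console.log(original.length, mojibake.length);

// The damage is reversible as long as no bytes were dropped along the way.
const roundTripped = Buffer.from(mojibake, "latin1").toString("utf8");
console.log(roundTripped === original);
```

Note that filenames which have already been sanitized on disk (for example with bytes replaced by underscores) can no longer be recovered this way, which is why the recovery has to happen before the file is saved.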
extensions/feishu/src/media.test.ts:862 (on main) actively asserts this broken behavior:
it("keeps JSON-derived file_name metadata unchanged", async () => {
  const fileName = "武汉15座山登山信息汇总.csv";
  const latin1LookingFileName = Buffer.from(fileName, "utf8").toString("latin1");
  messageResourceGetMock.mockResolvedValueOnce({
    data: Buffer.from("fake-file-data"),
    file_name: latin1LookingFileName,
  });
  const result = await downloadMessageResourceFeishu({ ... });
  expect(result.fileName).toBe(latin1LookingFileName); // pinned mojibake
});
OpenClaw version
2026.4.5 (also verified by source inspection on main commit 6861d8a6)
Operating system
macOS 15 (Darwin 25.2.0)
Install method
npm global
Model
minimax-portal/MiniMax-M2.7
Provider / routing chain
openclaw -> minimax-portal
Additional provider/model setup details
Not relevant — this is a filename-encoding bug in the inbound media pipeline; no model traffic is involved.
Logs, screenshots, and evidence
# saved filename observed on disk (mojibake, pre-patch):
$ ls -la ~/.openclaw/media/inbound/ | tail -3
-rw------- 1 user staff 1613681 May 13 01:30 Agentå_ç_æ_å_å_¹è_-...---<uuid>.pdf
-rw------- 1 user staff 84406 May 13 01:41 ä¼_ä_å¾_ä_20260413-133035_2x---<uuid>.png
-rw------- 1 user staff 391395 May 13 01:31 æ_æ_codexæ_å_ç_ç_å_½ä_æ_ç_æ_äº_ä_å¼_å_¾_...---<uuid>.jpg
# saved filename observed after applying a local recover patch:
$ ls -la ~/.openclaw/media/inbound/ | grep 01:59
-rw------- 1 user staff 650261 May 13 01:59 微信图片_20260509120820_848_2166---<uuid>.png
# upstream main code paths (commit 6861d8a6):
# 1. extensions/feishu/src/bot-content.ts:312 — parseMediaKeys returns raw parsed.file_name
# 2. extensions/feishu/src/bot.ts (inbound resolveFeishuMediaList) — feeds to saveMediaBuffer with no recover
# 3. extensions/feishu/src/media.ts:183 — recoverUtf8FileNameFromLatin1Header exists but only used at line 203 (Content-Disposition path)
Impact and severity
Affected: every Feishu user uploading files with CJK / Hiragana / Katakana / Hangul filenames to OpenClaw-bridged bots. In our deployment (an enterprise multi-agent system) this is ~30% of inbound files.
Severity: Medium-High (files save successfully but become hard to identify, audit, and search by name).
Frequency: 100% reproducible for any filename containing CJK characters.
Consequence: User-facing filenames are unreadable; downstream Skills that key off filename (e.g. routing, knowledge-base ingestion) misbehave; users have to rename manually.
Additional information
This is filed per @vincentkoc's invite on #48388 ("If this still reproduces on current main with a different path, reply here and we can reopen or split it back out"). The original issue is now locked, so I am opening a separate ticket.
Proposed fix scope (~10 LOC + test rewrite); happy to send a PR if scope is confirmed:
- Apply recoverUtf8FileNameFromLatin1Header (possibly renamed to drop Header, since it would no longer apply only to headers) to parsed.file_name in parseMediaKeys, or apply it once at the saveMediaBuffer call site, which catches both result.fileName and mediaKeys.fileName in one place.
- Rewrite media.test.ts:862 to assert the corrected behavior, and add a guard case showing café-©.txt is still preserved unchanged.
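The single call-site option could look something like the sketch below. Everything here is hypothetical: the FileNameSources shape, the resolveInboundFileName name, and the precedence between the two sources are assumptions for illustration, and the recover function is injected rather than imported so the snippet stays self-contained.

```typescript
type RecoverFn = (name: string) => string;

// Hypothetical view of the two filename sources mentioned in the proposal.
interface FileNameSources {
  downloadedFileName?: string; // stands in for result.fileName
  messageFileName?: string;    // stands in for mediaKeys.fileName
}

// Resolve the name once, immediately before saving, so that both sources
// pass through the same recovery step. The precedence order is an assumption.
function resolveInboundFileName(
  sources: FileNameSources,
  recover: RecoverFn,
): string | undefined {
  const raw = sources.downloadedFileName ?? sources.messageFileName;
  return raw === undefined ? undefined : recover(raw);
}
```

Doing the recovery at one choke point avoids having to remember it in every parser that can produce a filename, at the cost of running the check on names that were already correct.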
Local workaround currently in production: patching dist/monitor-*.js with marker RnB-PATCH(2026-05-13). Verified working on 2026.4.5 — same 01:59 evidence above.