🐛 fix(local-system): guard readFile against binary blobs and oversized output by arvinxx · Pull Request #14602 · lobehub/lobehub

arvinxx · 2026-05-09T16:35:55Z

Summary

lobe-local-system.readFile previously had no guard against binary content or oversized output. Reading a 27KB base64-encoded blob inflated the next LLM call to 3.28M tokens and 416s and triggered a DB rollback (LOBE-8703 incident op_1778337437659_…). This PR adds four layers of protection in readLocalFile:

Extension hard-reject: anything outside the text-readable + special-parser whitelist returns a structured error pointing the agent at runCommand (file/hexdump/strings) instead of force-reading the file through TextLoader.
Binary content sniff: read the first 8KB; refuse if it contains a null byte (and isn't UTF-16) or >30% of decoded chars are control / U+FFFD. Skipped for files that go through dedicated parsers (pdf/docx/xls/xlsx/pptx).
File size cap: 10MB hard limit checked via stat before opening the file.
Output caps: per-line cap of 8K chars (so a 27KB single base64 line can't bypass loc=[0,200]) and total cap of 500K chars, with new truncated / linesTruncated flags on ReadFileResult.
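
The output caps in the last item can be illustrated with a minimal sketch. The function name `capOutput`, the exact constants, and the assumption that both flags are booleans are illustrative only; the PR text gives only "8K chars", "500K chars", and the flag names.

```typescript
// Hypothetical constants: the PR says "8K" and "500K" chars; exact values are assumed.
const MAX_LINE_CHARS = 8_000;
const MAX_TOTAL_CHARS = 500_000;

interface CappedOutput {
  content: string;
  linesTruncated: boolean; // at least one line was cut to the per-line cap
  truncated: boolean; // the total cap stopped output early
}

function capOutput(
  lines: string[],
  maxLine: number = MAX_LINE_CHARS,
  maxTotal: number = MAX_TOTAL_CHARS,
): CappedOutput {
  const kept: string[] = [];
  let total = 0;
  let linesTruncated = false;
  let truncated = false;

  for (const line of lines) {
    let current = line;
    if (current.length > maxLine) {
      current = current.slice(0, maxLine);
      linesTruncated = true;
    }
    // Count the newline that joins this line to the previous one.
    const cost = current.length + (kept.length > 0 ? 1 : 0);
    if (total + cost > maxTotal) {
      truncated = true;
      break;
    }
    kept.push(current);
    total += cost;
  }

  return { content: kept.join('\n'), linesTruncated, truncated };
}
```

Under this sketch a single 27,000-character line comes back as 8,000 characters with `linesTruncated` set, which is the case a line-count limit alone cannot catch.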

Also: expanded TEXT_READABLE_FILE_TYPES to cover commonly missed source / config extensions (cjs, mts, cts, jsonc, json5, env, properties, gradle, lua, dart, scala, groovy), exported a new isReadableFileType helper, and updated the readFile manifest description so the LLM knows the rules up front.
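
A rough sketch of the extension gate follows. The names `checkExtension` and `extensionOf`, the result shape, and the error wording are assumptions; the real `isReadableFileType` helper and the full `TEXT_READABLE_FILE_TYPES` list are not shown in this PR description.

```typescript
// Partial, illustrative list: the newly added extensions named in the PR text plus a
// few common ones. The real TEXT_READABLE_FILE_TYPES is assumed to be much longer.
const TEXT_EXTENSIONS = new Set([
  'txt', 'md', 'json', 'ts',
  'cjs', 'mts', 'cts', 'jsonc', 'json5', 'env',
  'properties', 'gradle', 'lua', 'dart', 'scala', 'groovy',
]);
// Formats the PR says go through dedicated parsers.
const PARSER_EXTENSIONS = new Set(['pdf', 'docx', 'xls', 'xlsx', 'pptx']);

type ReadDecision =
  | { ok: true; via: 'text' | 'parser' }
  | { ok: false; error: string };

function extensionOf(filePath: string): string {
  const base = filePath.split(/[\\/]/).pop() ?? '';
  const dot = base.lastIndexOf('.');
  return dot === -1 ? '' : base.slice(dot + 1).toLowerCase();
}

function checkExtension(filePath: string): ReadDecision {
  const ext = extensionOf(filePath);
  if (TEXT_EXTENSIONS.has(ext)) return { ok: true, via: 'text' };
  if (PARSER_EXTENSIONS.has(ext)) return { ok: true, via: 'parser' };
  return {
    ok: false,
    // Hypothetical message; the actual structured error is not quoted in the PR.
    error: `Refusing to read ".${ext}" as text; inspect it with runCommand (file / hexdump / strings) instead.`,
  };
}
```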

WriteFile render — match EditLocalFile

While the readFile guard work was in flight, WriteFile's renderer still used a syntax-highlighted preview while EditLocalFile used a unified diff via PatchDiff. Code-type new files now synthesize a `--- /dev/null` / `+++ b/` patch and render through the same PatchDiff component, so a Write and an Edit in the conversation feel visually consistent. Markdown keeps its rendered preview because a rendered doc reads better than an all-green diff.
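
One way to synthesize such a new-file patch is sketched below. The function name `synthesizeNewFilePatch` is hypothetical, and how the result is handed to PatchDiff is not shown here; only the `--- /dev/null` / `+++ b/` header shape comes from the PR text.

```typescript
// Builds a unified diff that adds every line of `content` to a file that did not exist.
// Simplification: the "\ No newline at end of file" marker is omitted.
function synthesizeNewFilePatch(filePath: string, content: string): string {
  // Ignore one trailing newline so it does not become an extra empty "+" line.
  const body = content.endsWith('\n') ? content.slice(0, -1) : content;
  const lines = body.length === 0 ? [] : body.split('\n');

  const header = ['--- /dev/null', `+++ b/${filePath}`];
  if (lines.length === 0) return header.join('\n') + '\n';

  // A new file has no old range (-0,0) and a new range covering all lines.
  const hunk = `@@ -0,0 +1,${lines.length} @@`;
  const added = lines.map((line) => `+${line}`);
  return [...header, hunk, ...added].join('\n') + '\n';
}
```

Every body line carries a `+` prefix, which is why a diff renderer shows the whole file in green.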

Test refactor — readFile / readFiles end-to-end

The LocalFileCtr.readFile / readFiles tests in apps/desktop/src/main/controllers/__tests__/LocalFileCtr.test.ts deep-mocked node:fs/promises and @lobechat/file-loaders. Since the controller is a thin pass-through to readLocalFile, the assertions ended up testing shell internals (already covered in packages/local-file-shell), and broke the moment readLocalFile gained the new pre-flight checks above. Moved them into a sibling LocalFileCtr.readFile.test.ts that runs against a real tmpdir + real file-loaders, so adding more upstream guards no longer requires touching this suite.

Refs LOBE-8703.

Test plan

`bunx vitest run packages/file-loaders` (59 tests pass, including 10 new sniff tests)
`bunx vitest run packages/local-file-shell` (91 tests pass, including 9 new readFile guard tests)
`bunx vitest run packages/builtin-tool-local-system` (59 tests pass)
`bunx vitest run apps/desktop/src/main/controllers/__tests__/LocalFileCtr.test.ts apps/desktop/src/main/controllers/__tests__/LocalFileCtr.readFile.test.ts` (60 + 5 tests pass)
`bun run type-check` — no new errors from this change
Manual smoke against the original incident path (`readFile('/tmp/cm.bundle.b64')`) — expect immediate structured rejection
Manual: trigger a WriteFile in the agent UI and confirm the render shows a green new-file diff via PatchDiff (Markdown still renders as Markdown)

🤖 Generated with Claude Code

vercel · 2026-05-09T16:36:00Z

Project	Deployment	Actions	Updated (UTC)
lobehub	Ready	Preview, Comment	May 9, 2026 5:51pm

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1baacbf7d9

codecov · 2026-05-09T16:43:10Z

Codecov Report

❌ Patch coverage is 90.52632% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.85%. Comparing base (25ee822) to head (e37ecda).

Additional details and impacted files

@@           Coverage Diff            @@
##           canary   #14602    +/-   ##
========================================
  Coverage   65.85%   65.85%            
========================================
  Files        2891     2893     +2     
  Lines      250433   250506    +73     
  Branches    30045    29205   -840     
========================================
+ Hits       164914   164976    +62     
- Misses      85369    85379    +10     
- Partials      150      151     +1

Flag	Coverage Δ
app	`60.07% <ø> (-0.01%)`	⬇️
database	`91.80% <ø> (ø)`
packages/agent-runtime	`80.48% <ø> (ø)`
packages/builtin-tool-lobe-agent	`83.41% <ø> (ø)`
packages/context-engine	`84.00% <ø> (ø)`
packages/conversation-flow	`92.43% <ø> (ø)`
packages/file-loaders	`87.60% <90.52%> (+<0.01%)`	⬆️
packages/memory-user-memory	`74.74% <ø> (ø)`
packages/model-bank	`99.94% <ø> (ø)`
packages/model-runtime	`83.73% <ø> (ø)`
packages/prompts	`70.31% <ø> (ø)`
packages/python-interpreter	`92.90% <ø> (ø)`
packages/ssrf-safe-fetch	`0.00% <ø> (ø)`
packages/types	`4.86% <ø> (ø)`
packages/utils	`88.02% <ø> (ø)`
packages/web-crawler	`88.29% <ø> (ø)`

Components	Coverage Δ
Store	`66.99% <ø> (ø)`
Services	`53.98% <ø> (ø)`
Server	`70.85% <ø> (-0.01%)`	⬇️
Libs	`55.22% <ø> (ø)`
Utils	`82.51% <ø> (ø)`

…d output

Previously `lobe-local-system.readFile` would happily decode any extension as UTF-8 and return the entire content. Reading a 27KB base64-encoded git bundle blew up the next LLM call to 3.28M tokens / 416s and triggered a DB rollback. The default 200-line cap was bypassed because base64 was a single very long line.

Add four layers of protection in `readLocalFile`:

- Hard-reject extensions outside the text-readable + special-parser whitelist with a structured error pointing the agent at runCommand.
- Sniff the first 8KB and refuse files that look binary (null bytes or >30% non-printable chars).
- 10MB hard size cap before the file is read into memory.
- Cap each returned line at 8K chars and total output at 500K chars, with `truncated` / `linesTruncated` flags surfaced in the result.

Refs LOBE-8703.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ry sniffer

The binary sniffer rejected UTF-16LE/BE files that lacked a BOM because their alternating 0x00 bytes tripped the null-byte heuristic. `TextLoader` already has a `detectUtf16NoBom` heuristic for these Windows-style exports; extract it to a shared `detectUtf16` util and run it in the sniffer before the null-byte check, decoding with the matching variant for the printable ratio test instead of declaring the file binary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
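
The ordering described in this commit (UTF-16 detection first, then the null-byte check, then the printable-ratio test) can be sketched as follows. This is not the project's implementation: the function bodies, the 0.7 / 0.1 parity thresholds, and the name `looksBinary` are assumptions made for illustration. Only the 8KB sample size and the 30% ratio come from the PR text.

```typescript
const SNIFF_BYTES = 8 * 1024; // "first 8KB" per the PR text
const MAX_SUSPICIOUS_RATIO = 0.3; // ">30%" per the PR text

type Utf16Variant = 'utf-16le' | 'utf-16be' | null;

// Hypothetical stand-in for the shared detectUtf16 util; thresholds are assumptions.
function detectUtf16(sample: Uint8Array): Utf16Variant {
  if (sample.length >= 2) {
    if (sample[0] === 0xff && sample[1] === 0xfe) return 'utf-16le';
    if (sample[0] === 0xfe && sample[1] === 0xff) return 'utf-16be';
  }
  const pairs = Math.floor(sample.length / 2);
  if (pairs < 2) return null;
  let evenZeros = 0;
  let oddZeros = 0;
  for (let i = 0; i < pairs * 2; i += 2) {
    if (sample[i] === 0) evenZeros++;
    if (sample[i + 1] === 0) oddZeros++;
  }
  // ASCII-heavy UTF-16 puts its zero bytes almost entirely at one byte parity.
  if (oddZeros / pairs > 0.7 && evenZeros / pairs < 0.1) return 'utf-16le';
  if (evenZeros / pairs > 0.7 && oddZeros / pairs < 0.1) return 'utf-16be';
  return null;
}

// Decodes 16-bit code units by hand so the sketch does not depend on decoder support.
function decodeUtf16(sample: Uint8Array, littleEndian: boolean): string {
  let out = '';
  for (let i = 0; i + 1 < sample.length; i += 2) {
    const unit = littleEndian
      ? sample[i] | (sample[i + 1] << 8)
      : (sample[i] << 8) | sample[i + 1];
    out += String.fromCharCode(unit);
  }
  return out;
}

function looksBinary(bytes: Uint8Array): boolean {
  const sample = bytes.subarray(0, SNIFF_BYTES);
  if (sample.length === 0) return false;

  const variant = detectUtf16(sample);
  let text: string;
  if (variant !== null) {
    text = decodeUtf16(sample, variant === 'utf-16le');
  } else {
    // Only non-UTF-16 input is rejected outright for containing a null byte.
    if (sample.indexOf(0) !== -1) return true;
    text = new TextDecoder('utf-8').decode(sample); // invalid bytes become U+FFFD
  }
  if (text.length === 0) return false;

  let suspicious = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    const isControl =
      (code < 0x20 && code !== 0x09 && code !== 0x0a && code !== 0x0d) || code === 0x7f;
    if (isControl || code === 0xfffd) suspicious++;
  }
  return suspicious / text.length > MAX_SUSPICIOUS_RATIO;
}
```

Running the UTF-16 check before the null-byte check is the point of the fix: a BOM-less UTF-16LE file is full of 0x00 bytes, so the reverse order would classify it as binary.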

Switch the WriteFile render from a syntax-highlighted preview to a synthesized "new file" unified diff via PatchDiff, matching the EditLocalFile visual. Markdown files keep their rendered preview.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous LocalFileCtr.readFile / readFiles tests deep-mocked node:fs/promises and @lobechat/file-loaders. Since the controller is a thin pass-through to readLocalFile, the assertions ended up testing shell internals (already covered in packages/local-file-shell), and broke as soon as readLocalFile gained new pre-flight checks. Move them into a sibling LocalFileCtr.readFile.test.ts that runs against a real tmpdir + real file-loaders, so adding more upstream guards no longer requires touching this suite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…d output (lobehub#14602)

* 🐛 fix(local-system): guard readFile against binary blobs and oversized output

Previously `lobe-local-system.readFile` would happily decode any extension as UTF-8 and return the entire content. Reading a 27KB base64-encoded git bundle blew up the next LLM call to 3.28M tokens / 416s and triggered a DB rollback. The default 200-line cap was bypassed because base64 was a single very long line.

Add four layers of protection in `readLocalFile`:

- Hard-reject extensions outside the text-readable + special-parser whitelist with a structured error pointing the agent at runCommand.
- Sniff the first 8KB and refuse files that look binary (null bytes or >30% non-printable chars).
- 10MB hard size cap before the file is read into memory.
- Cap each returned line at 8K chars and total output at 500K chars, with `truncated` / `linesTruncated` flags surfaced in the result.

Refs LOBE-8703.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix(file-loaders): preserve UTF-16 text files without a BOM in binary sniffer

The binary sniffer rejected UTF-16LE/BE files that lacked a BOM because their alternating 0x00 bytes tripped the null-byte heuristic. `TextLoader` already has a `detectUtf16NoBom` heuristic for these Windows-style exports; extract it to a shared `detectUtf16` util and run it in the sniffer before the null-byte check, decoding with the matching variant for the printable ratio test instead of declaring the file binary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 💄 style(local-system): render WriteFile new files as a unified diff

Switch the WriteFile render from a syntax-highlighted preview to a synthesized "new file" unified diff via PatchDiff, matching the EditLocalFile visual. Markdown files keep their rendered preview.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ✅ test(local-system): exercise readFile / readFiles end-to-end

The previous LocalFileCtr.readFile / readFiles tests deep-mocked node:fs/promises and @lobechat/file-loaders. Since the controller is a thin pass-through to readLocalFile, the assertions ended up testing shell internals (already covered in packages/local-file-shell), and broke as soon as readLocalFile gained new pre-flight checks. Move them into a sibling LocalFileCtr.readFile.test.ts that runs against a real tmpdir + real file-loaders, so adding more upstream guards no longer requires touching this suite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


…eged should be called before app is ready`

Two-part fix for a regression where reading any text/JSON/source file via the local-system `readFile` tool surfaced an Electron protocol error in the response content. The error fired *after* `stat()` succeeded (so missing-file ENOENT was unaffected), making it look like the file couldn't be parsed.

## Root cause

Stack trace (instrumented `read.ts` to capture it):

```
Error: protocol.registerSchemesAsPrivileged should be called before app is ready
    at new App (apps/desktop/dist/main/index.js:105339:21)
    at Module.<anonymous> (apps/desktop/dist/main/index.js:105615:11)
    at Module._compile (...)
```

`Module._compile` on `dist/main/index.js` means the main bundle is being freshly evaluated as a CJS module — re-running its top-level `var app = new App(); …; app.bootstrap();` after the real Electron-launched App was already ready.

Triggering chain: agent calls `readFile` → main runs `loadFile(path)` from `@lobechat/file-loaders` → `getFileLoader('txt')` → `await import('./text')`. The lazy text-loader chunk back-references the main bundle for the shared util `detectUtf16NoBom`:

```js
// dist/main/text-Cbmlmtca.js
const require_index = require("./index.js"); // ← re-evaluates main
…
const variant = require_index.detectUtf16NoBom(buffer);
```

Electron's main entry is not in Node's CJS module cache (it's bootstrapped separately), so this `require("./index.js")` triggers a fresh compile of the main bundle — re-running `new App()` and `protocol.registerSchemesAsPrivileged` *after* `app.whenReady()`, which is illegal per Electron's API contract.

Introduced by #14602 (`fix(local-system): guard readFile against binary blobs and oversized output`): adding `isBinaryContent.ts` made `detectUtf16NoBom` shared between the main bundle (via `sniffBinaryFile`) and the lazy text chunk, so rolldown placed it in main and rewrote the text chunk's call as a `require_index.detectUtf16NoBom`. Identical class of bug previously fixed for the `debug` package in #11827.

## Fix

1. **`packages/file-loaders/src/loaders/index.ts`** — TextLoader was lazy-imported for no real benefit. It's a 10KB module whose only deps are `node:fs/promises` and a tiny utf-16 detect util — nothing like the multi-MB parsers (pdfjs-dist, xlsx, mammoth) that the lazy pattern was designed for. Make it a static import; `getFileLoader('txt')` returns it synchronously. Result: the text chunk disappears entirely, removing this back-reference at the source.
2. **`apps/desktop/electron.vite.config.ts`** — defensive `manualChunks` rules so any future shared symbol doesn't recreate the same trap:
   - `vendor-file-loaders-utils` for the three small text/binary detection utils (`detectUtf16` / `isBinaryContent` / `isTextReadableFile`). Explicitly enumerated to avoid catching `parser-utils.ts`, which pulls in xmldom/yauzl/concat-stream (≈900KB) and belongs in the docx/pptx chunks instead.
   - `vendor-jszip` for JSZip — same root cause for `.docx` reads: the docx chunk had `require_index.require_lib()` (JSZip) back-referencing main. Both ends now share the vendor chunk; no main re-eval.

   Follows the project precedent set by #11827 for `debug`.

## Verification (live Electron via CDP)

Bundle inventory before/after:

| Chunk | Before | After |
| --- | --- | --- |
| `text-*.js` | 9.7KB (back-refs main) | (gone, inlined into main) |
| `vendor-file-loaders-utils-*.js` | n/a | 18KB |
| `vendor-jszip-*.js` | n/a | 899KB |
| `docx-*.js` back-refs | `require_index.require_lib` | none |

End-to-end via `tool.invokeBuiltinTool('lobe-local-system', 'readFile', …)`:

| File | Before | After |
| --- | --- | --- |
| `.md` / `.json` / `.ts` | `Error accessing or processing file: protocol.registerSchemesAsPrivileged should be called before app is ready` | real file content |

`grep -o 'require_index\\.[a-zA-Z_]*' dist/main/*-*.js | sort -u` → empty.

All 61 file-loaders tests pass; all 64 builtin-tool-local-system tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
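
The chunk rules in item 2 of the fix could look roughly like the function below, written in the Rollup-style `manualChunks(id)` function form. This is a sketch, not the project's config: the `/packages/file-loaders/src/utils/` directory and the jszip path test are assumptions, and wiring the function into `electron.vite.config.ts` is not shown.

```typescript
// Explicitly enumerated so that parser-utils.ts is never pulled into this chunk.
const FILE_LOADER_UTILS = new Set(['detectUtf16', 'isBinaryContent', 'isTextReadableFile']);

// Returns a chunk name for modules that must live in a shared vendor chunk,
// or undefined to leave the bundler's default chunking in place.
function manualChunks(id: string): string | undefined {
  const normalized = id.replace(/\\/g, '/');

  if (normalized.includes('/node_modules/jszip/')) return 'vendor-jszip';

  // Assumed location of the small detection utils inside the monorepo.
  if (normalized.includes('/packages/file-loaders/src/utils/')) {
    const file = normalized.split('/').pop() ?? '';
    const name = file.replace(/\.[cm]?[jt]s$/, '');
    if (FILE_LOADER_UTILS.has(name)) return 'vendor-file-loaders-utils';
  }

  return undefined;
}
```

Placing shared symbols in a named vendor chunk means both the main bundle and the lazy chunks import that chunk, so no lazy chunk needs to `require` the main entry.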

…o /client (#14888)

* 🐛 fix(local-system): forward all grepContent params + move executor to /client

The local-system executor was reducing the agent's full grepContent params ({pattern, glob, output_mode, -i/-n/-A/-B/-C, multiline, head_limit, type, scope, ...}) down to {directory, pattern} before handing them to the runtime. `directory` isn't recognized by the IPC layer (which expects path/scope), so cwd silently fell back to process.cwd() (= apps/desktop/ in dev), and with glob/-i/output_mode all stripped grep matched anything containing the pattern across the whole tree — explaining LOBE-8666's dist/main/index.js + tsconfig.tsbuildinfo leaks.

Also audited the rest of the executor layer:

- listFiles: forward `limit` (was silently dropped → manifest default of 100 always won).
- getCommandOutput: forward `filter` (was silently dropped → no regex filter ever applied to streamed output).
- runCommand: mirror `run_in_background` → `background` so ComputerRuntime.RunCommandState.isBackground reflects reality (the IPC handler reads run_in_background directly, so the command itself ran in background — only the state field was wrong).

Structure: moved src/executor/ → src/client/executor/ to match the other builtin-tool packages (task / lobe-agent / knowledge-base) and consolidate renderer-only code under /client. Dropped the `./executor` package subpath; consumers now import from `…/client`.

Defensive: also added a resolveSearchPath helper in apps/desktop's contentSearch module that reads params.scope as a fallback for params.path, so any non-executor caller (direct IPC, future Gateway path) that passes `scope` still gets routed correctly instead of falling through to process.cwd().

Regression coverage:

- grepContent full forwarding (LOBE-8666 case + all optional flags)
- listFiles.limit forwarding
- getCommandOutput.filter forwarding
- runCommand.run_in_background → background mirror
- resolveSearchPath fallback semantics (3 cases in base.test.ts)

Verified end-to-end via Electron CDP — tool.invokeBuiltinTool with the LOBE-8666 params returns 9 clean .ts matches (no dist/, no .tsbuildinfo); listFiles {limit:3} returns 3 files (totalCount 10); runCommand {run_in_background:true} reports state.isBackground=true.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix(desktop): readFile fails with `protocol.registerSchemesAsPrivileged should be called before app is ready`

Two-part fix for a regression where reading any text/JSON/source file via the local-system `readFile` tool surfaced an Electron protocol error in the response content. The error fired *after* `stat()` succeeded (so missing-file ENOENT was unaffected), making it look like the file couldn't be parsed.

## Root cause

Stack trace (instrumented `read.ts` to capture it):

```
Error: protocol.registerSchemesAsPrivileged should be called before app is ready
    at new App (apps/desktop/dist/main/index.js:105339:21)
    at Module.<anonymous> (apps/desktop/dist/main/index.js:105615:11)
    at Module._compile (...)
```

`Module._compile` on `dist/main/index.js` means the main bundle is being freshly evaluated as a CJS module — re-running its top-level `var app = new App(); …; app.bootstrap();` after the real Electron-launched App was already ready.

Triggering chain: agent calls `readFile` → main runs `loadFile(path)` from `@lobechat/file-loaders` → `getFileLoader('txt')` → `await import('./text')`. The lazy text-loader chunk back-references the main bundle for the shared util `detectUtf16NoBom`:

```js
// dist/main/text-Cbmlmtca.js
const require_index = require("./index.js"); // ← re-evaluates main
…
const variant = require_index.detectUtf16NoBom(buffer);
```

Electron's main entry is not in Node's CJS module cache (it's bootstrapped separately), so this `require("./index.js")` triggers a fresh compile of the main bundle — re-running `new App()` and `protocol.registerSchemesAsPrivileged` *after* `app.whenReady()`, which is illegal per Electron's API contract.

Introduced by #14602 (`fix(local-system): guard readFile against binary blobs and oversized output`): adding `isBinaryContent.ts` made `detectUtf16NoBom` shared between the main bundle (via `sniffBinaryFile`) and the lazy text chunk, so rolldown placed it in main and rewrote the text chunk's call as a `require_index.detectUtf16NoBom`. Identical class of bug previously fixed for the `debug` package in #11827.

## Fix

1. **`packages/file-loaders/src/loaders/index.ts`** — TextLoader was lazy-imported for no real benefit. It's a 10KB module whose only deps are `node:fs/promises` and a tiny utf-16 detect util — nothing like the multi-MB parsers (pdfjs-dist, xlsx, mammoth) that the lazy pattern was designed for. Make it a static import; `getFileLoader('txt')` returns it synchronously. Result: the text chunk disappears entirely, removing this back-reference at the source.
2. **`apps/desktop/electron.vite.config.ts`** — defensive `manualChunks` rules so any future shared symbol doesn't recreate the same trap:
   - `vendor-file-loaders-utils` for the three small text/binary detection utils (`detectUtf16` / `isBinaryContent` / `isTextReadableFile`). Explicitly enumerated to avoid catching `parser-utils.ts`, which pulls in xmldom/yauzl/concat-stream (≈900KB) and belongs in the docx/pptx chunks instead.
   - `vendor-jszip` for JSZip — same root cause for `.docx` reads: the docx chunk had `require_index.require_lib()` (JSZip) back-referencing main. Both ends now share the vendor chunk; no main re-eval.

   Follows the project precedent set by #11827 for `debug`.

## Verification (live Electron via CDP)

Bundle inventory before/after:

| Chunk | Before | After |
| --- | --- | --- |
| `text-*.js` | 9.7KB (back-refs main) | (gone, inlined into main) |
| `vendor-file-loaders-utils-*.js` | n/a | 18KB |
| `vendor-jszip-*.js` | n/a | 899KB |
| `docx-*.js` back-refs | `require_index.require_lib` | none |

End-to-end via `tool.invokeBuiltinTool('lobe-local-system', 'readFile', …)`:

| File | Before | After |
| --- | --- | --- |
| `.md` / `.json` / `.ts` | `Error accessing or processing file: protocol.registerSchemesAsPrivileged should be called before app is ready` | real file content |

`grep -o 'require_index\\.[a-zA-Z_]*' dist/main/*-*.js | sort -u` → empty.

All 61 file-loaders tests pass; all 64 builtin-tool-local-system tests pass.
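
The grepContent forwarding and the `resolveSearchPath` fallback described in the first commit of this squash can be sketched as below. The interface, the `buildGrepRequest` name, the `cwd` parameter, and the treatment of empty strings as missing are assumptions; only the "scope as a fallback for path" rule and the parameter names come from the commit message.

```typescript
// Illustrative subset of the grepContent parameters listed in the commit message.
interface GrepContentParams {
  pattern: string;
  path?: string;
  scope?: string;
  glob?: string;
  output_mode?: string;
  head_limit?: number;
  multiline?: boolean;
  type?: string;
  '-i'?: boolean;
  '-n'?: boolean;
}

// Per the commit message, the old executor kept only {directory, pattern}.
// Copying the whole object keeps every optional flag on its way to the runtime.
function buildGrepRequest(params: GrepContentParams): GrepContentParams {
  return { ...params };
}

// path wins, scope is the fallback, and the working directory is the last resort.
// The caller passes cwd in (for example process.cwd()) so the helper stays pure.
function resolveSearchPath(params: { path?: string; scope?: string }, cwd: string): string {
  return params.path || params.scope || cwd;
}
```

The three branches of `resolveSearchPath` line up with the "3 cases" the commit message says are covered in base.test.ts, although the actual test cases are not shown.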

@hezhijie0327

# 🚀 LobeHub Release (20260518)

**Release Date:** May 18, 2026
**Since v2.1.58:** 208 merged PRs · 209 commits · 16 contributors

> v2.2.0 introduces the **Chief Agent Operator** — an agent that runs itself end-to-end. It self-iterates against its own output, assembles sub-agent teams on demand through the heterogeneous runtime, and drives a unified task system that knows when to pause for a human. Self-review, AssistantGroup, and tasks/scheduling all converge into one operator surface.

---

## ✨ Highlights

### 🎩 Chief Agent Operator

- **Self-iteration exits Lab** — Agent Signal's self-review pipeline ships proposal actions straight into briefs and auto-executes the approved follow-ups, with prompts hardened against eval. The operator now critiques and re-runs its own work without a human in the loop. (#14769, #14583, #14647, #14882)
- **Auto-formed agent teams** — Heterogeneous AssistantGroup gains Monitor-style signal callbacks, read-only SubAgent threads with breadcrumb headers, and a thread switcher. The operator dispatches sub-agents and you can step into any branch to see what the team is doing. (#14859, #14658, #14845, #14715)
- **Task system as the operator's runway** — Claude Code surfaces task tools, AskUserQuestion freeform notes, and a dedicated `waitingForHuman` topic status; `lobe-task` exposes `setTaskSchedule`; the scheduler is hardened (maxExecutions cap, sub-10min heartbeat block, race-free SchedulerForm). Long-running operator runs no longer go silent and stop themselves when human input is needed. (#14870, #14639, #14713, #14865, #14853)

### 🚀 Cloud & runtime

- **Cloud Claude Code V3** — Repo picker, GitHub token flow, and sandbox-aware context bring cloud-hosted Claude Code to feature parity with local; cloud sandbox completion now triggers the task lifecycle end-to-end. (#14568, #14822, #14681)
- **Heterogeneous agent multi-replica safety** — Subagent threads, ingest refresh, and parallel-tool counts now survive replica swaps without losing parent_id or rolling back tool state. (#14897, #14631, #14806, #14838)
- **Built-in tool lifecycle hooks** — `onBeforeCall` / `onAfterCall` land on the built-in tool runtime; sub-agent dispatch moves to `lobe-agent`; self-iteration aligns with the shared inspector pattern. (#14719, #14715, #14827)
- **Knowledge base RAG unified** — Client and server share one `KnowledgeBaseSearchService`; KB files preserved on `NoSuchKey` instead of silently lost. (#14673, #14501)

### 💬 Workspace experience

- **Home daily brief + recommendations** — The home screen opens with a linkable welcome, paired input hint, and a recommendations module sourced from the operator's hetero action library. (#14589, #14645, #14770)
- **Chat mode + redesigned action bar** — The chat input gains a Chat/Agent mode toggle and a re-pitched action bar with icon-and-color action tag chips. (#14774, #14903, #14846)
- **Documents tree, optimistic** — Document tree creates, deletes, and inline renames now apply optimistically; the agent-documents index hides web crawls and switches to a table layout. (#14714, #14292)
- **Branded MCP inspectors** — Linear MCP tool calls render with the same branded inspector as the built-in Linear skill; CC MCP and built-in skills now share inspector code. (#14864, #14884)
- **Bot identity gating** — Device tools are gated by sender identity, the activator bypass is closed, and Slack mpim plus Discord DM regressions are fixed. (#14634, #14664, #14733)

---

## 🏗️ Core Agent & Signal Pipeline

### Self-iteration & Agent Signal

- Self-iteration graduates out of Lab, with service, tool, name, and concept structure unified across `agent-signal`, `prompts`, `database`, and `builtin-tool-self-iteration`. (#14699, #14769)
- Self-review now proposes actions to briefs and auto-executes the approved set, with eval-verified prompt hardening. (#14583, #14657, #14647)
- Self-iteration built-in tool aligns with the shared runtime + inspector patterns. (#14827)
- Agent Signal prompts adapt their response language and avoid blocking agent execution. (#14890, #14775, #14882)
- Receipt descriptions now carry an Agent Signal marker, and self-review hinted skill documents route correctly. (#14764, #14895)

### Heterogeneous agent runtime

- Subagent threads render read-only with a breadcrumb header and thread switcher; SUBAGENT badge dropped, indentation tightened. (#14658, #14845, #14783)
- Multi-replica safety: ingest refresh restores tools/model from DB to fix parent_id breaks; new-step assistants sync across replicas; subagent-tagged events no longer leak into the main gateway handler. (#14897, #14631, #14838)
- Fetch-triggering events are deferred to keep parallel tool counts from rolling back. (#14806)
- AskUserQuestion is wired for Claude Code, with auto-decline disabled and a freeform note input on the cloud side; `waitingForHuman` is a first-class topic status. (#14639, #14629, #14870)
- AssistantGroup gains Monitor-style signal callbacks; project skills surface in the working sidebar and markdown preview. (#14859, #14896)
- Cloud Claude Code V3 — repo picker, GitHub token, sandbox context; credentials alert and disabled input when not configured. (#14568, #14822)
- Cloud sandbox completion now triggers the task lifecycle end-to-end. (#14681)

### Agent runtime & context engine

- Built-in tool runtime gets `onBeforeCall` / `onAfterCall` lifecycle hooks. (#14719)
- `CompletionLifecycle`, `HumanInterventionHandler`, and `stepPresentation` are extracted from the runtime monolith. (#14441)
- Per-tool timeout is honored end-to-end for client tool dispatch. (#14817)
- Compression budget accounts for `tool_calls`, reasoning content, and tool defs; `call_llm` forwards tools into the budget. (#14813, #14837)
- Pre-flight context check now fails fast for OpenAI-compatible providers. (#14824)
- Malformed `tool_call` names are recovered instead of finishing the step silently. (#14577)
- Sub-agent dispatch moves from `lobe-gtd` to `lobe-agent`. (#14715)
- Hidden built-in tools now appear in the system prompt @-mention list. (#14823)

### Agent tracing & operations

- New `agent_operations` table and runtime persistence for every hetero-agent operation. (#14416, #14736)
- `signOperationJwt` issues 4-hour signed operation tokens. (#14586)
- S3 trace snapshots are zstd-compressed; DB `trace_s3_key` aligns with the `.json.zst` suffix; legacy `.json` fallback preserved on fetch. (#14807, #14860, #14826)

---

## 📱 Platform & Integrations

### Bot / Channels

- Device tools are gated by sender identity. (#14634)
- Activator bypass closed and device-access checks converged. (#14664)
- Slack mpim supported; Discord DM regression fixed; Slack connect + slash commands repaired. (#14733, #14591)
- Bot channels, bot watch, bot callback service, and system bot reliability fixes. (#14847, #14796, #14570, #14784, #14649)
- Online Messager scaffolding. (#14755)

### Onboarding

- Home daily brief with linkable welcome and paired input hint. (#14589)
- Recommendations module sourced from the hetero agent action library. (#14645)
- Chat onboarding passes request triggers via metadata and preserves the resume request. (#14770, #14798)
- Discovery turn progress gated by phase, with a reminder on stalled discovery. (#14842, #14833)
- FullNameStep back button rejoins the shared prefix; ModeSwitch hidden in production. (#14898, #14760)
- Agent marketplace folds into the web onboarding tool. (#14578, #14672)
- Onboarding interests stored as keys instead of free text; early-exit skips marketplace and drops CJK prompts. (#14624, #14598)

### Model providers

- Gemini 3.1 Flash-Lite cards; Gemini schema sanitizer drops non-compliant `enum` / `required`; zero `cachedContentTokenCount` handled in usage conversion. (#14604, #14740, #14567)
- DeepSeek-V4 model cards and pricing restored to official rates. (#14110, #14911)
- ernie-5.1 and spark-x2-flash support; Grok 4.3 `reasoning_effort` support. (#14643, #14731, #14642)
- SiliconCloud catalog synced with API; duplicates removed; reasoning params adjusted. (#14464)
- Minimax derives `max_tokens` from context window to avoid `ExceededContextWindow`. (#14814)
- aihubmix uses the full models endpoint for a complete list; stale empty-apiKey test dropped. (#14511, #14669)
- Stream parse errors are enriched with provider + model context. (#14636)
- Visual content parts are consumed in the server runtime; video image references move to a JSON object. (#14637, #14900)
- Google function call magic `thoughtSignature` now attaches to every part, not just the last turn. (#14904)
- Service model assignments settings added; model extend-param options removed. (#14712, #14607)

### Built-in tools & knowledge base

- `lobe-task` exposes `setTaskSchedule`; task scheduler hardened (maxExecutions cap, sub-10min heartbeat blocked, SchedulerForm race fix, rapid automation-mode toggle stabilized). (#14713, #14865, #14853, #14801)
- KnowledgeBaseSearchService shares RAG runtime across client and server. (#14673)
- KB files preserved on `NoSuchKey` and orphan documents/tasks cleaned. (#14501)
- Document tree gets optimistic create/delete + inline rename. (#14714)
- agent-documents index hides web crawls and switches to a table layout. (#14292)
- `lobe-clarify` and SKILL.md frontmatter parsing/edit validation are unified. (#14566)
- AnalyzeVisualMedia inspector + Portal HTML preview refactor; HTML preview restored for AssistantGroup messages. (#14777, #14811)
- Branded inspector shared between CC MCP and built-in Linear skill.
(#14884, #14864) --- ## 🖥️ CLI & User Experience ### Chat & Conversation - Chat mode toggle and redesigned chat input action bar. (#14774) - Action tag chips switch to icon + colored label; ActionDropdown closes on sibling-open and focus-out; submenu uses native header/footer slots. (#14903, #14802, #14901) - Action bar padding equalized around the send button; skeleton shows in action bar while config loads. (#14846, #14656) - `useCmdEnterToSend` is respected in thread & task inputs; send button enables after pasting into thread/comment input. (#14850, #14816) - TopicChatDrawer state preserved during close animation. (#14803) - Only the last assistant block animates during markdown streaming. (#14906) - Right working panel no longer auto-collapses on chat mount; home agent config fetched so knowledge toggles reflect in UI. (#14883, #14834) ### Tasks - Task scheduler, hotkey, comment, and TodoList polish. (#14707) - Add Subtask button & card baseline aligned; activity card stop run; task agent manager polish. (#14848, #14559, #14569) - Task template skeleton CLS reduced; task page placeholder copy refreshed. (#14788, #14704) - Task agent model snapshotted into `task.config` at create time. (#14670) - User-feedback card, task card polish, and Run-now context menu in markdown. (#14727) - Inline skill auth in recommended task templates. (#14676) ### Navigation & Layout - Tab bar gains a Chrome-style divider between inactive tabs. (#14892) - SideBarDrawer & header layout polish; nav ActionIcon sizing unified; TodoList encapsulation improved. (#14762, #14692) - Desktop header icons, sidebar density, and task menus polished. (#14724) - Standardized header action icon sizes. (#14717) - Chat topic title length increased; copy session ID added to topic dropdown menu. (#14659, #14595) - Heterogeneous agent topic rows regain indentation. (#14783) ### Other polish - Usage token details shortened; tool execution time formatted as `Xmin Ys`. 
(#14849, #14641) - Tool arguments display gets word-wrap toggle; long tool-call params wrap instead of truncate. (#14706, #14640) - Editor stops showing per-line placeholder once content is present. (#14852) - Visible divider between queued messages; intervention confirmation bar polished. (#14593, #14587) - Credit top-up copy refreshed; auth captcha retry copy refreshed; brief recommendations layout polished. (#14821, #14561, #14871) --- ## 🔧 Tooling & Developer Experience - Dev-only feature flag override panel. (#14565) - `__DEV__` define replaces `process.env.NODE_ENV` in the SPA. (#14696) - Agent-settings drops Meta/Documents tabs and restores `inputTemplate`. (#14874) - `local-system` forwards all `grepContent` params and moves the executor to `/client`. (#14888) - `lobe-task` and `setTaskSchedule` exposed. (#14713) - Memory user-memory benchmark agent config and source-id extraction schemas. (#14779, #14778) - CLI man page drops stale cron entry; `clearMessages` hotkey removed. (#14709, #14906) - Skill docs simplified; cloud heteroContext gains sandbox TTL + public-repo fork push guide. (#14785, #14761) --- ## 🔒 Security & Reliability - **Security:** Sensitive comments and examples sanitized from the production JS bundle. (#14557) - **Security:** Inactive OIDC access rejected. (#14674) - **Security:** CASC `new Function()` template replaced with safe string builders. (#14751) - **Security:** Sign-in captcha flow removed in favor of safer flow. (#14573) - **Security:** Desktop local file previews restricted to safe roots. (#14789) - **Security:** Image binary capped at 3.75 MB so base64 payload stays under the Anthropic 5 MB limit. (#14711) - **Reliability:** Neon/Node pools get error listeners to prevent Lambda crashes. (#14606) - **Reliability:** `paradedb.match(...)` replaces hardcoded normalizer in memory search. (#14590) - **Reliability:** `PlaceholderVariablesProcessor` errors carry diagnostic context. 
(#14741) - **Reliability:** File storage upload checks are serialized; multiple account link bug fixed. (#14829, #14562) - **Reliability:** `ScrollShadow` replaced with `ScrollArea` to fix a React infinite render loop (error code 185). (#14689) - **Reliability:** Embedding token cap enforced — long memory queries are limited and truncated before search. (#14757) - **Reliability:** Embed binary blob guard + oversized output cap in `local-system.readFile`. (#14602) - **Reliability:** Windows npm CLI shims resolved before spawning agents. (#14772, #14720) - **Reliability:** Vite pinned to 8.0.12 to avoid the rolldown 1.0.1 preload regression; desktop runtime externals split from native deps. (#14804, #14776) - **Reliability:** Old lobehub cron job removed; WeChat URL rules dropped from web crawler. (#14630, #14633) --- ## 👥 Contributors Huge thanks to **16 contributors** who shipped **208 merged PRs** this cycle. @hezhijie0327 · @sxjeru · @hardy-one · @Bianzinan · @brone1323 · @YuSaZh · @Wxh16144 · @arvinxx · @Innei · @tjx666 · @neko · @lijian · @rdmclin2 · @sudongyuer · @AmAzing129 · @rivertwilight Plus @lobehubbot for maintenance translations. --- **Full Changelog**: v2.1.58...v2.2.0
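The `local-system.readFile` entry above (#14602) refers to the per-line and total output caps described in this PR's summary. As a rough illustration only, the capping step could look like the sketch below. The function name `capOutput`, the `CappedOutput` shape, and the exact constants (8000 and 500000 characters for "8K" and "500K") are assumptions made for this example, not the code that was merged; only the `truncated` / `linesTruncated` flag names come from the PR description.

```typescript
// Hypothetical result shape; the real ReadFileResult has more fields.
interface CappedOutput {
  content: string;
  truncated: boolean; // the total cap was hit and trailing lines were dropped
  linesTruncated: boolean; // at least one line was shortened to the per-line cap
}

// Assumed values for the "8K" per-line and "500K" total caps.
const MAX_LINE_CHARS = 8000;
const MAX_TOTAL_CHARS = 500000;

function capOutput(
  lines: string[],
  maxLine: number = MAX_LINE_CHARS,
  maxTotal: number = MAX_TOTAL_CHARS,
): CappedOutput {
  const kept: string[] = [];
  let total = 0;
  let truncated = false;
  let linesTruncated = false;

  for (const line of lines) {
    let current = line;
    // Per-line cap: a single huge line (for example one long base64 line)
    // cannot slip through a small line-range request.
    if (current.length > maxLine) {
      current = current.slice(0, maxLine);
      linesTruncated = true;
    }
    // Count the newline that will join this line to the previous one.
    const cost = current.length + (kept.length > 0 ? 1 : 0);
    // Total cap: stop adding lines once the budget would be exceeded.
    if (total + cost > maxTotal) {
      truncated = true;
      break;
    }
    kept.push(current);
    total += cost;
  }

  return { content: kept.join('\n'), truncated, linesTruncated };
}
```

With these assumed limits, a file consisting of one 27,000-character line comes back as 8,000 characters with `linesTruncated` set, which is the scenario the summary says a line-range limit alone could not catch.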

dosubot Bot added the size:L (This PR changes 100-499 lines, ignoring generated files) and feature:tool (Tool calling and function execution) labels May 9, 2026

sourcery-ai Bot reviewed May 9, 2026

chatgpt-codex-connector Bot reviewed May 9, 2026

Comment thread packages/file-loaders/src/utils/isBinaryContent.ts
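For readers without the diff open, the binary sniff described in the PR summary (inspect the first 8KB; refuse on a null byte unless the file is UTF-16, or when more than 30% of decoded characters are control characters or U+FFFD) can be sketched roughly as follows. This is not the contents of `isBinaryContent.ts`: the function names, the BOM-based UTF-16 check, and the choice to exempt tab, LF, and CR from the control-character count are all assumptions made for illustration.

```typescript
const SNIFF_BYTES = 8 * 1024; // only the first 8KB is inspected
const MAX_SUSPICIOUS_RATIO = 0.3; // more than 30% suspicious chars => binary

// Assumption: UTF-16 is recognised by its byte-order mark only.
function hasUtf16Bom(bytes: Uint8Array): boolean {
  if (bytes.length < 2) return false;
  const isLittleEndian = bytes[0] === 0xff && bytes[1] === 0xfe;
  const isBigEndian = bytes[0] === 0xfe && bytes[1] === 0xff;
  return isLittleEndian || isBigEndian;
}

// Hypothetical helper: returns true when the sample looks like binary data.
function looksBinary(content: Uint8Array): boolean {
  const sample = content.subarray(0, SNIFF_BYTES);
  if (sample.length === 0) return false;

  // UTF-16 text legitimately contains null bytes, so skip the checks below.
  if (hasUtf16Bom(sample)) return false;

  // A null byte in non-UTF-16 content is treated as a binary marker.
  if (sample.indexOf(0) !== -1) return true;

  // Non-fatal decoding turns invalid UTF-8 sequences into U+FFFD.
  const text = new TextDecoder('utf-8').decode(sample);
  if (text.length === 0) return false;

  let suspicious = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    const isAllowedWhitespace = code === 0x09 || code === 0x0a || code === 0x0d;
    const isControl = (code < 0x20 && !isAllowedWhitespace) || code === 0x7f;
    if (isControl || code === 0xfffd) suspicious++;
  }
  return suspicious / text.length > MAX_SUSPICIOUS_RATIO;
}
```

A heuristic like this is cheap because it never reads past the sample, at the cost of occasionally misjudging files whose first 8KB is unrepresentative; the summary notes that files handled by dedicated parsers (pdf/docx/xls/xlsx/pptx) skip the sniff entirely.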

vercel Bot deployed to Preview May 9, 2026 16:56 View deployment

vercel Bot deployed to Preview May 9, 2026 17:18 View deployment

arvinxx and others added 4 commits May 10, 2026 01:42

arvinxx force-pushed the fix/lobe-8703-readfile-binary-protection branch from f641d84 to e37ecda Compare May 9, 2026 17:42

vercel Bot deployed to Preview May 9, 2026 17:51 View deployment

arvinxx merged commit ca6c9ad into canary May 10, 2026
35 checks passed

arvinxx deleted the fix/lobe-8703-readfile-binary-protection branch May 10, 2026 04:01

arvinxx mentioned this pull request May 12, 2026

🚀 release: 20260512 #14722

Closed

arvinxx mentioned this pull request May 18, 2026

🚀 release: v2.2.0 #14915

Merged

lobehubbot mentioned this pull request May 18, 2026

ProviderBizError: { message: 'fetch failed' } #14930

Closed

🐛 fix(local-system): guard readFile against binary blobs and oversized output #14602

arvinxx merged 4 commits into canary from fix/lobe-8703-readfile-binary-protection

arvinxx commented May 9, 2026 (edited)

vercel Bot commented May 9, 2026 (edited)

sourcery-ai Bot left a comment

chatgpt-codex-connector Bot left a comment (💡 Codex Review)

codecov Bot commented May 9, 2026 (edited): Codecov Report

1 participant
