perf(filesearch): move AsyncFzf index construction to a worker thread#4621
Conversation
| * Optional because the directory-only path holds no such resources. | ||
| * Implementations must be safe to call multiple times. | ||
| */ | ||
| dispose?(): Promise<void>; |
There was a problem hiding this comment.
[Critical] Worker thread leak in the CLI's @-picker path. useAtCompletion (in packages/cli/src/ui/hooks/useAtCompletion.ts) creates a FileSearch via FileSearchFactory.create() and stores it in fileSearch.current. When cwd/config changes trigger re-initialization or the component unmounts, the old FileSearch is silently overwritten without calling dispose(). The VSCode path was correctly wired up in FileMessageHandler.clearFileSearchCache(), but the CLI hook — by far the most common consumer — was missed.
On workspaces with ≥5 000 files, each RecursiveFileSearch holds a FzfWorkerHandle that owns a worker_threads Worker (~5–10 MB + full file list). In a long-running CLI session, every directory switch or config reload leaks another worker. worker.unref() prevents blocking process.exit(), but the workers accumulate for the lifetime of the process.
Fix in useAtCompletion.ts:
| dispose?(): Promise<void>; | |
| // In the useEffect cleanup: | |
| return () => { | |
| searchAbortController.current?.abort(); | |
| if (slowSearchTimer.current) { | |
| clearTimeout(slowSearchTimer.current); | |
| } | |
| void fileSearch.current?.dispose?.(); | |
| fileSearch.current = null; | |
| }; |
Also dispose the previous instance before overwriting in initialize().
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Fixed in 81a513d. Added dispose() calls in both the initialize() path (dispose old instance before creating new one) and the effect cleanup function. Worker threads are now properly released on re-init and unmount.
| const inst = transport.spawn(files, options); | ||
| // Surface init errors at create() time so callers don't get a "fzf not | ||
| // initialized" surprise on the first find(). | ||
| await inst.ready(); |
There was a problem hiding this comment.
[Suggestion] Two robustness gaps in create():
-
Worker leak on init failure: If
inst.ready()rejects (worker sendsinit-erroror crashes during construction),create()propagates the error without callinginst.dispose(). The spawnedWorkeris never terminated. In daemon mode, repeated init failures accumulate idle workers. -
No timeout on
readyPromise: If the worker hangs duringnew AsyncFzf(...)(e.g., OOM on a huge file list, OS scheduling stall),create()blocks the caller indefinitely with no diagnostic output. The @-picker UI freezes on "loading..." with no error, no stack, no way out.
| await inst.ready(); | |
| const inst = transport.spawn(files, options); | |
| try { | |
| const timeoutMs = Math.max(10_000, files.length / 10); | |
| await Promise.race([ | |
| inst.ready(), | |
| new Promise<never>((_, rej) => | |
| setTimeout(() => rej(new Error( | |
| `fzf worker init timed out after ${timeoutMs}ms (${files.length} files)` | |
| )), timeoutMs), | |
| ), | |
| ]); | |
| } catch (err) { | |
| await inst.dispose(); | |
| throw err; | |
| } | |
| return new FzfWorkerHandle(inst); |
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Fixed in 81a513d. create() now wraps inst.ready() in a try/catch that calls dispose() on failure. Also added a Promise.race timeout scaled to file count (Math.max(10_000, files.length / 10) ms) so a hung worker doesn't block the @-picker indefinitely.
| // Strip the heavy `positions` Set from each item before sending — | ||
| // structuredClone serialises Sets but the @-picker only needs the | ||
| // ranked `item` strings. Keeps IPC payloads small on big result sets. | ||
| const trimmed = items.map((entry) => ({ item: entry.item })); |
There was a problem hiding this comment.
[Suggestion] fzf.find() returns ALL matching items, which are all serialized via postMessage structured clone. On a large workspace with a broad single-character pattern, this can produce 50k+ results (~2.5 MB of IPC payload). The consumer in fileSearch.ts truncates to maxResults (~72 items) only after the full result set crosses the message channel.
Consider adding a limit field to the FindMessage protocol and slicing results in the worker before serialization:
// In FindMessage:
interface FindMessage {
type: 'find';
reqId: number;
pattern: string;
limit?: number;
}
// In the .then() handler:
const limit = msg.limit ?? items.length;
const trimmed = items.slice(0, limit).map((entry) => ({ item: entry.item }));A generous limit (e.g., 200) caps IPC at ~10 KB while giving the main thread enough candidates for downstream filtering.
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Fixed in 81a513d. Added limit?: number to FindMessage protocol. Worker now calls .slice(0, limit) before postMessage. The caller in fileSearch.ts passes Math.max(200, maxResults * 3) — generous enough for downstream ignore-filter to drop entries without starving results, while capping IPC at ~10 KB instead of multi-MB payloads on broad patterns.
| installInProcessFzfTransport, | ||
| } from './fzfWorkerHandle.js'; | ||
|
|
||
| describe('FzfWorkerHandle', () => { |
There was a problem hiding this comment.
[Suggestion] The worker transport code path (138 lines in fzfWorkerHandle.ts — the core feature of this PR) has zero test coverage. All 5 tests here use either the default in-process fallback (tiny inputs below the 5 000 threshold) or explicitly call installInProcessFzfTransport(). No test spawns a real worker_threads Worker.
Untested paths include:
- Worker spawn → init → find → dispose lifecycle
init-error/find-errormessage handling- Worker unexpected exit →
failAll()rejection of pending promises - Dispose-then-find race condition (pending finds should reject)
The in-process tests verify the interface contract, but bugs in IPC message handling, result reconstruction (positions: new Set<number>() reassembly), or worker lifecycle would go undetected.
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Fixed in 81a513d. Added a worker transport (real worker_threads) describe block with 5 tests that use __setWorkerThresholdForTests(1) without installInProcessFzfTransport() to exercise the real spawn/IPC path:
- spawn → init → find → dispose lifecycle
- find() rejects after dispose()
- limit parameter across IPC
- create() rejects and disposes on init failure
- concurrent find() calls
The tests build fzfWorker.js on-the-fly via esbuild in beforeAll and clean up in afterAll. Also discovered and fixed a bug: find() after dispose() would hang forever — now rejects immediately.
DragonnZhang
left a comment
There was a problem hiding this comment.
Review Summary
Verdict: COMMENT (not approving — critical lifecycle issues need resolution; CI also failing on Windows)
Overall Assessment
The architecture is well-designed: the worker/main-thread transport abstraction is clean, the threshold-based fallback is pragmatic, and the protocol between worker and handle is simple and correct. The esbuild config restructure to emit a standalone dist/fzfWorker.js is the right approach. Code quality is high throughout — good comments, clear separation of concerns, and the installInProcessFzfTransport() test hook avoids the pitfalls of global test setup.
However, the existing inline comments from @wenshao identify genuine blockers that must be addressed before merge:
-
Worker thread leak in
useAtCompletion(Critical,fileSearch.ts:106) — The CLI's @-picker hook createsFileSearchinstances but never calls the newdispose()method, neither oncwd/configchange nor on component unmount. With the worker transport active on large workspaces, this accumulates orphaned worker threads over a session. -
Worker leak on init failure in
create()(fzfWorkerHandle.ts:309) — Ifinst.ready()rejects, the spawned worker is never disposed. Atry/catcharoundawait inst.ready()that callsinst.dispose()on failure would close this gap. -
No test coverage for the worker transport path (
fzfWorkerHandle.test.ts:14) — All 5 tests exercise the in-process fallback. The worker transport (the actual feature of this PR) is untested. At minimum, a test using__setWorkerThresholdForTests(1)withoutinstallInProcessFzfTransport()would verify the spawn/message/dispose lifecycle.
Independent Findings
I independently verified the following and found them correct:
- The message protocol between worker and handle is sound — init/find/result/dispose sequencing is clean,
failAllhandles error and unexpected exit correctly. worker.unref()correctly prevents the worker from blocking process exit.RecursiveFileSearch.dispose()is properly idempotent (interface requirement satisfied).- The esbuild parallel build via
Promise.allis correct — outputs go to separate paths, and the worker build's metafile omission is intentional. getWorkerScriptPath()caching is acceptable for production; theexistsSyncfallback fornpm run dev(tsx) is a documented and reasonable trade-off.
CI Note
The Windows test failure (Test (windows-latest, Node 22.x)) should also be investigated before merge, as it may relate to worker_threads spawn latency differences on Windows.
Findings: 0 new inline comments (all critical issues already covered by existing review). 3 areas need attention before merge.
— qwen-code via Qwen Code /review
|
@qwen-code /review |
5b4a301 to
81a513d
Compare
| if (slowSearchTimer.current) { | ||
| clearTimeout(slowSearchTimer.current); | ||
| } | ||
| void fileSearch.current?.dispose?.(); |
There was a problem hiding this comment.
[Critical] Effect cleanup disposes the worker on every dependency change, silently breaking @-completion search.
The effect's dependency array is [state.status, state.pattern, config, cwd]. Every status transition (e.g., INITIALIZING→READY after initialize() completes, or READY→SEARCHING on keystroke) triggers this cleanup, which disposes the just-created FzfWorkerHandle and sets fileSearch.current = null.
Trace: initialize() creates a searcher, assigns fileSearch.current, dispatches INITIALIZE_SUCCESS → status becomes READY → cleanup fires → dispose() kills the worker → fileSearch.current = null. Then SEARCHING triggers search() which early-returns because !fileSearch.current.
The base version's cleanup only aborted the search controller and cleared the timer — the searcher ref was preserved across re-renders.
| void fileSearch.current?.dispose?.(); | |
| searchAbortController.current?.abort(); | |
| if (slowSearchTimer.current) { | |
| clearTimeout(slowSearchTimer.current); | |
| } |
Move the dispose() + null assignment to a dedicated lifecycle effect with [cwd, config] deps (matching the existing RESET dispatch), so it only fires on workspace/config changes and unmount — not on every keystroke:
useEffect(() => {
return () => {
void fileSearch.current?.dispose?.();
fileSearch.current = null;
};
}, [cwd, config]);— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Fixed in f3b9666. Moved dispose() out of the worker effect cleanup (which fires on every [state.status, state.pattern] change) into a separate effect with [cwd, config] deps — now only fires on workspace/config change and unmount.
| await Promise.race([ | ||
| inst.ready(), | ||
| new Promise<never>((_, rej) => | ||
| setTimeout( |
There was a problem hiding this comment.
[Suggestion] setTimeout inside Promise.race is never cleared when inst.ready() resolves first (the happy path). The timer stays alive for Math.max(10_000, files.length / 10) ms (10 s for ≤100k files, up to 100 s at MAX_CRAWL_FILES), holding a closure over the files array and preventing GC.
| setTimeout( | |
| const timeoutMs = Math.max(10_000, files.length / 10); | |
| let timerId: ReturnType<typeof setTimeout> | undefined; | |
| try { | |
| await Promise.race([ | |
| inst.ready(), | |
| new Promise<never>((_, rej) => { | |
| timerId = setTimeout( | |
| () => | |
| rej( | |
| new Error( | |
| `fzf worker init timed out after ${timeoutMs}ms (${files.length} files)`, | |
| ), | |
| ), | |
| timeoutMs, | |
| ); | |
| }), | |
| ]); | |
| } catch (err) { | |
| await inst.dispose(); | |
| throw err; | |
| } finally { | |
| if (timerId !== undefined) clearTimeout(timerId); | |
| } |
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Fixed in f3b9666. Added timerId variable and clearTimeout(timerId) in a finally block so the timer is cleaned up on both success and failure paths.
| const previous = this.fileSearchInstances.get(rootPath); | ||
| this.fileSearchInstances.delete(rootPath); | ||
| this.fileSearchInitializing.delete(rootPath); | ||
| void previous?.dispose?.().catch(() => { |
There was a problem hiding this comment.
[Suggestion] setupFileWatchers() returns a Disposable that disposes file watchers and the workspace-folders change listener, but never iterates this.fileSearchInstances to call dispose() on the cached FileSearch instances. When the webview panel closes, any RecursiveFileSearch instances that promoted to the worker path (≥5k files) retain live worker threads.
Consider extending the Disposable returned by setupFileWatchers():
return {
dispose: () => {
for (const watcher of this.fileWatchers.values()) watcher.dispose();
this.fileWatchers.clear();
foldersChangeListener.dispose();
for (const instance of this.fileSearchInstances.values()) {
void instance.dispose?.().catch(() => {});
}
this.fileSearchInstances.clear();
this.fileSearchInitializing.clear();
},
};— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Fixed in f3b9666. Extended the Disposable returned by setupFileWatchers() to iterate fileSearchInstances and call dispose() on each, plus clear the initializing map.
| files: [ | ||
| 'cli.js', | ||
| // Worker thread entry loaded by FzfWorkerHandle at runtime via | ||
| // `new Worker(new URL('./fzfWorker.js', import.meta.url))`. Must ship |
There was a problem hiding this comment.
[Suggestion] The comment says new Worker(new URL('./fzfWorker.js', import.meta.url)) but the actual code in fzfWorkerHandle.ts uses resolveBundleDir(import.meta.url) + path.join(dir, 'fzfWorker.js') — a filesystem-based path resolution, not a new URL() constructor pattern.
| // `new Worker(new URL('./fzfWorker.js', import.meta.url))`. Must ship | |
| // Worker thread entry loaded by FzfWorkerHandle at runtime via | |
| // `resolveBundleDir(import.meta.url)` + `path.join(dir, 'fzfWorker.js')`. | |
| // Must ship in the tarball or the @-picker silently falls back to the | |
| // in-thread AsyncFzf path on big workspaces in npm-installed CLIs. |
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Fixed in f3b9666. Updated comment to reference resolveBundleDir(import.meta.url) + path.join(dir, 'fzfWorker.js') instead of the new URL() pattern.
| pending.clear(); | ||
| }; | ||
|
|
||
| worker.on('error', (err) => { |
There was a problem hiding this comment.
[Suggestion] worker.on('error') and worker.on('exit') handlers that call failAll() have no test coverage. If failAll() has a bug, pending find() calls could hang indefinitely in production when the worker crashes.
Consider adding a test that forces a worker crash (e.g., by passing a script that throws at the top level) and verifying that pending find() calls reject with an appropriate error message.
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Fixed in f3b9666. Added a test that overwrites fzfWorker.js with a script that calls process.exit(1) on find, then verifies handle.find() rejects with /exited unexpectedly/. This exercises the worker.on('exit') → failAll() path.
| maxFiles: MAX_CRAWL_FILES, | ||
| }); | ||
| this.buildResultCache(); | ||
| await this.buildResultCache(); |
There was a problem hiding this comment.
[Suggestion] RecursiveFileSearch.dispose() is a new public method but fileSearch.test.ts has no tests for it. Callers (useAtCompletion.ts, FileMessageHandler.ts) rely on dispose() for worker thread cleanup — a regression here would silently leak workers.
Consider adding a test:
it('should release fzf handle on dispose', async () => {
const fileSearch = FileSearchFactory.create({ ... });
await fileSearch.initialize();
await expect(fileSearch.dispose()).resolves.toBeUndefined();
// Idempotent
await expect(fileSearch.dispose()).resolves.toBeUndefined();
});— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Fixed in f3b9666. Added two tests in fileSearch.test.ts: one verifying RecursiveFileSearch.dispose() resolves and is idempotent, another confirming DirectoryFileSearch has no dispose method (expected — it holds no worker resources).
`new AsyncFzf(allFiles, ...)` is misleadingly named — its constructor is synchronous and dominates the @-picker's main-thread cost on workspaces with >20k files (where the v2→v1 algorithm switch kicks in). It's the remaining freeze source after PR QwenLM#3214 moved the crawl onto async spawn- based git ls-files / ripgrep / fdir. This change hosts the AsyncFzf instance in a worker_threads worker so the construction happens off the main thread; only completed find() results cross the message channel. Below ~5k files the in-thread path is kept because worker spawn + IPC overhead exceeds the construction cost there. Notes: - `FzfWorkerHandle.create()` awaits the worker `ready` event before returning, so init errors surface at create() time rather than as a surprise on the first find(). - Worker is `unref()`-ed so a mid-find @-picker doesn't pin process exit. - `FileSearch` interface gains an optional `dispose?(): Promise<void>` to release worker resources. `FileMessageHandler.clearFileSearchCache` calls it so the long-running VS Code extension host doesn't accumulate workers across watcher-triggered evictions. - `installInProcessFzfTransport()` is provided for tests / sandboxed environments that need to skip worker spawn entirely. - Bundle gains a second esbuild build that emits `dist/fzfWorker.js` alongside `dist/cli.js`; `prepare-package.js` ships it in the tarball. - In `npm run dev` (tsx) the bundled worker file isn't on disk, so the handle silently falls back to in-thread — acceptable for a dev script. Closes the door on QwenLM#3455's two open `FileIndexService` review items by design: there is no `whenReady()` (handle awaits ready inside `create()`) and there is no in-worker crawl (crawl stays on main, only fzf moves). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Fix worker thread leak in useAtCompletion: dispose previous FileSearch instance on re-init and component cleanup - Fix worker leak on init failure: wrap inst.ready() in try/catch with timeout, dispose on failure - Fix find()-after-dispose hang: reject immediately when worker is disposed - Add limit field to FindMessage protocol to cap IPC payload before postMessage serialization - Add real worker_threads test coverage (spawn/find/dispose lifecycle, concurrent finds, init failure, post-dispose rejection) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix useAtCompletion cleanup: move dispose() to [cwd,config] effect so it only fires on workspace change, not every keystroke/status transition - Clear setTimeout in Promise.race on happy path to prevent timer leak - Dispose FileSearch instances in FileMessageHandler on panel close - Fix incorrect comment in prepare-package.js (resolveBundleDir, not URL) - Add worker crash test: verify failAll rejects pending find() on exit - Add RecursiveFileSearch.dispose() test in fileSearch.test.ts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
f3b9666 to
6be81d3
Compare
| const initialize = async () => { | ||
| try { | ||
| // Dispose previous instance to prevent worker thread leaks on | ||
| // re-initialization (cwd/config change triggers RESET → re-init). |
There was a problem hiding this comment.
[Suggestion] The async initialize() is fire-and-forget with no cancellation guard. If the component unmounts or cwd/config changes while await searcher.initialize() is in flight (worker spawn + ready handshake takes several ms), the cleanup effect runs fileSearch.current?.dispose?.() — but fileSearch.current is still null from line 164, so nothing is disposed. When the pending initialize() resolves, fileSearch.current = searcher writes a live worker into the ref with no owner to dispose it.
In React 18 Strict Mode (dev), the double-invocation makes this deterministic: invocation-1's async init completes after cleanup already ran, leaking a worker thread.
Consider adding a cancelled flag or AbortController that the effect cleanup signals:
const initialize = async (signal: AbortSignal) => {
try {
await fileSearch.current?.dispose?.();
fileSearch.current = null;
const searcher = FileSearchFactory.create({ ... });
await searcher.initialize();
if (signal.aborted) { await searcher.dispose?.(); return; }
fileSearch.current = searcher;
// ...
} catch (err) { /* ... */ }
};— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Fixed in 0dae073. Added a cancelled flag inside the effect closure — set to true in cleanup. After await searcher.initialize(), if cancelled is true the searcher is disposed immediately instead of being written to fileSearch.current. This covers both the unmount-during-init and the Strict Mode double-invocation scenarios.
|
|
||
| describe('worker transport (real worker_threads)', () => { | ||
| let workerExists = false; | ||
| const outfile = path.resolve(import.meta.dirname, 'fzfWorker.js'); |
There was a problem hiding this comment.
[Suggestion] beforeAll builds fzfWorker.js into the source tree (packages/core/src/utils/filesearch/fzfWorker.js), but .gitignore has no entry for it. If the test process crashes (OOM, Ctrl-C, segfault) before afterAll runs fs.unlinkSync(outfile), the compiled artifact persists and could be accidentally committed.
A stale fzfWorker.js in the source tree also causes getWorkerScriptPath() to return non-null in tsx/dev mode, making the dev path unexpectedly use the worker transport with a potentially outdated build.
Add packages/core/src/utils/filesearch/fzfWorker.js to .gitignore, or write the test artifact to os.tmpdir() instead.
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Fixed in 0dae073. Added packages/core/src/utils/filesearch/fzfWorker.js to .gitignore so a crash-orphaned test artifact won't be accidentally committed or cause dev-mode to use a stale worker build.
DragonnZhang
left a comment
There was a problem hiding this comment.
Review Summary: perf(filesearch): move AsyncFzf index construction to a worker thread
Verdict: COMMENT (1 finding)
This is a well-engineered PR. The transport abstraction (in-process vs worker_threads) with threshold-based selection is clean, the IPC protocol is simple and correct, and disposal is handled properly at all integration points (useAtCompletion, FileMessageHandler, RecursiveFileSearch). The test coverage is thorough — covering both transports, concurrent find(), crash recovery, and dispose idempotency.
Findings: 1 medium-severity
MEDIUM — Concurrent initialize() can leak worker threads (useAtCompletion.ts:164)
When cwd/config changes rapidly, overlapping initialize() calls can leak a worker thread. The await fileSearch.current?.dispose?.() only disposes the assigned reference. If two inits overlap, the first init's in-flight searcher (with its worker) is never disposed when the second init overwrites fileSearch.current.
This race pre-dates the PR, but the cost is materially higher with worker_threads (~1–4 MB stack + V8 isolate per leaked thread vs bounded heap for in-process AsyncFzf). A generation counter or AbortController in the initialize effect would close the gap.
What looks good
worker.unref()prevents the worker from blocking process exit- Init timeout scaled by file count (
Math.max(10_000, files.length / 10)) is pragmatic - Stripping
positionsSet before IPC serialization keeps payloads small installInProcessFzfTransport()with mandatory restorer avoids test pollution- Dual esbuild builds (main + worker) with
Promise.allis clean prepare-package.jswhitelistingfzfWorker.jsensures the tarball includes the worker
| try { | ||
| // Dispose previous instance to prevent worker thread leaks on | ||
| // re-initialization (cwd/config change triggers RESET → re-init). | ||
| await fileSearch.current?.dispose?.(); |
There was a problem hiding this comment.
MEDIUM — Concurrent initialize() calls can leak worker threads
The await fileSearch.current?.dispose?.() here only disposes the previously assigned reference. When cwd/config changes rapidly, two initialize() invocations can overlap:
- Init A starts — disposes old searcher, creates
searcherA(spawns worker A), awaitssearcherA.initialize(). cwdchanges again — Init B starts —fileSearch.currentis stillnull(Init A hasn't set it yet) — dispose is a no-op.- Init A completes — sets
fileSearch.current = searcherA. - Init B completes — sets
fileSearch.current = searcherB—searcherAand its worker thread are leaked.
This race pre-dates the PR, but each leaked worker_threads worker holds ~1–4 MB stack + a V8 isolate, making the leak materially more expensive than the old in-process AsyncFzf.
| await fileSearch.current?.dispose?.(); | |
| const initialize = async () => { | |
| const myGeneration = ++initGeneration; | |
| try { | |
| // Dispose previous instance to prevent worker thread leaks on | |
| // re-initialization (cwd/config change triggers RESET → re-init). | |
| await fileSearch.current?.dispose?.(); | |
| fileSearch.current = null; | |
| const searcher = FileSearchFactory.create({ |
A module-level let initGeneration = 0; plus a stale-check after await searcher.initialize() would cancel superseded initializations:
await searcher.initialize();
if (myGeneration !== initGeneration) {
// A newer initialize() already started — dispose ours and bail.
await searcher.dispose?.();
return;
}
fileSearch.current = searcher;There was a problem hiding this comment.
Fixed in 0dae073. Added a cancelled flag inside the effect — the cleanup sets it to true, and after await searcher.initialize() the code checks it before assigning to fileSearch.current. If cancelled (because a newer init already started or the component unmounted), the newly-created searcher is disposed immediately. This prevents the overlap scenario you described where Init A's searcher would be orphaned when Init B overwrites the ref.
The second test case in the formatDateForContext describe block was missing its closing `});`, causing a TS1005 parse error that broke CI on all platforms. This was introduced upstream in 6a99ed1. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add cancelled flag to initialize() effect so a superseded or unmounted init disposes the newly-created searcher instead of writing it into fileSearch.current with no owner - Add fzfWorker.js test artifact to .gitignore to prevent accidental commits if the test process crashes before afterAll cleanup Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| const failAll = (err: Error) => { | ||
| if (readyState === 'pending') { | ||
| readyState = 'failed'; | ||
| readyReject(err); |
There was a problem hiding this comment.
[Suggestion] failAll() only transitions readyState to 'failed' when the current state is 'pending'. If the worker crashes after a successful init (state 'ready'), readyState remains 'ready' permanently. Subsequent find() calls pass the 'disposed' guard, await readyPromise resolves instantly, and worker.postMessage() throws on a dead worker. Combined with the .catch(() => { return []; }) in fileSearch.ts:178, the @-picker silently returns zero results for every keystroke with no self-healing until cwd/config change.
Consider making failAll unconditionally mark the handle as failed when not disposed:
const failAll = (err: Error) => {
if (readyState === 'pending') {
readyReject(err);
}
if (readyState !== 'disposed') {
readyState = 'failed';
}
for (const slot of pending.values()) {
slot.reject(err);
}
pending.clear();
};And guard find() against the 'failed' state:
async find(pattern, limit?) {
if (readyState === 'disposed' || readyState === 'failed') {
throw new Error(`fzf worker not available (${readyState})`);
}
// ...
}— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Fixed in 769a236. failAll() now unconditionally sets readyState = 'failed' when the state is not 'disposed', so a post-init worker crash correctly marks the handle as dead. find() now guards against both 'disposed' and 'failed' states, rejecting immediately instead of silently returning empty results via the .catch(() => []) in fileSearch.ts. Added a test that verifies: crash worker on first find(), then assert subsequent find() rejects with /not available.*failed/.
failAll() previously only set readyState to 'failed' when the worker crashed during init (state 'pending'). A crash after successful init left the handle in 'ready' state, causing subsequent find() calls to silently return empty results via the catch handler. Now failAll() unconditionally marks the handle as 'failed' (unless already disposed), and find() rejects immediately on a failed handle. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
DragonnZhang
left a comment
There was a problem hiding this comment.
Review Summary — PR #4621
perf(filesearch): move AsyncFzf index construction to a worker thread
This PR moves the synchronous new AsyncFzf(allFiles, ...) constructor off the main thread into a worker_threads worker, eliminating the Ink render-loop freeze on workspaces with >20k files. The architecture is clean: a 4-message RPC protocol between worker and main thread, with an in-process fallback for small file counts (<5k) where worker spawn overhead would exceed the construction cost.
Verification Results
- CI: All checks pass (Lint, CodeQL, Tests on macOS/Ubuntu/Windows)
- Tests: 42/42 filesearch tests pass (27 existing + 2 new dispose tests + 13 FzfWorkerHandle tests including 7 real worker_threads scenarios)
- Lint: Clean on all 6 changed source files
- Existing review comments: 14 issues raised by prior reviewers (wenshao, DragonnZhang) across 3 rounds of feedback — all addressed in commits 81a513d, f3b9666, 0dae073, 769a236
Analysis
The prior reviewers were extremely thorough, covering: worker thread lifecycle leaks, init failure handling, IPC payload optimization, timer cleanup, race conditions in concurrent initialize() calls, React effect cleanup semantics, test coverage gaps, and .gitignore hygiene. The author addressed every finding with appropriate fixes.
My independent analysis of the current codebase confirmed:
cancelledflag correctly guards against stale async completions inuseAtCompletion.tsfailAll()now unconditionally marks the handle as'failed'on post-init crashesFzfWorkerHandle.create()properly disposes on init failure and cleans up the timeout timerRecursiveFileSearch.dispose()is idempotent and safely handles undefinedfzf- The
[cwd, config]effect correctly separates resource disposal from per-keystroke effect cleanup setupFileWatchers()Disposable now iterates and disposes all cachedFileSearchinstances
No new high-confidence issues found beyond what prior reviewers already identified and the author fixed.
Verdict: APPROVE
Local runtime verification (tmux-driven, bundled CLI)Ran the PR's actual surface — the bundled CLI's Environment: Apple M5 Pro, macOS 26.5, Node v22.22.2. VerdictMechanically: everything checks out. Worker is spawned exactly when promised, results are correct, the fallback is robust, lifecycle is clean, packaging is right. Performance premise: not reproducible on this machine. Current Scenario matrix (echo latency while typing through init + search)
"Worker used" was proven by instrumenting Edge probes (all on PR build, 80k tree)
Final pane, PR build, 80k tree (typing stayed responsive throughout):
|
| input | ctor time |
|---|---|
| 80k × 35-char paths | 26 ms |
| 100k × 35-char | 28 ms |
| 100k × 120-char | 61 ms |
| 100k × node_modules-style deep paths | 57 ms |
| 100k × CJK filenames | 50 ms |
The PR cites "~30 ms at ~5k files" from fzf-bench, which would extrapolate to ~600 ms at 100k — my measurements come in ~20× under that. Even granting several-fold slower hardware, the ctor stays sub-frame-budget-×-a-few at the cap.
2. The v1/v2 switch can't affect ctor cost, and find() is already non-blocking by design. In fzf's BaseFinder the constructor work (strToRunes over all items) is identical for v1/v2 — the option only selects algoFn. And AsyncFzf.find() runs through asyncMatcher, which processes 1000 items per chunk with setImmediate yields between chunks — at 80k files that's ~1.6 ms of main-thread work per chunk. That's why main measures flat ≤39 ms echo latency: the "Async" in AsyncFzf is doing its job. (Total find() cost is real — 86–184 ms per keystroke at 15k–80k files in isolation — but it's sliced, not a freeze.)
The genuinely sync freeze (the original issue) was the fdir crawl, which #3214 already fixed. After #3214, on this hardware, I could not produce any input stall on main with the biggest tree the crawler will accept.
Other findings
environmentContext.test.tsin this PR is now byte-identical to currentmain(main fixed the same missing-braces issue independently) — that commit becomes a no-op on merge, no conflict.- PR body says "7 files, +650/−74"; the branch is actually 11 files, +958/−76. Stale description, worth refreshing.
- Not exercised here: the
vscode-ide-companiondispose-on-evict path (needs a VS Code harness), worker-crash-mid-find at the surface (covered by the 5 new unit tests), Windows, and genuinely slow hardware.
Suggestion for the merge decision
The implementation quality is good — fallback, threshold, dispose, unref, packaging all verified working at the real surface. What's missing is evidence the disease exists post-#3214: @callmeYe, can you reproduce a measurable input freeze on current main (exact machine, OS, Node version, file count, and how it was measured)? If yes, this merges on solid ground. If the freeze only reproduces on the pre-#3214 build the original issue was filed against, then this PR buys a second build artifact + worker lifecycle complexity for headroom rather than a fix, and that's a different (still discussable) trade.
|
LGTM |
TLDR
Hosts the @-picker's
AsyncFzfinstance in aworker_threadsworker sothe freeze on workspaces with >20k files is removed. Narrow follow-up to
#3455 (which I'm closing); leaves the crawler entirely to upstream's
design from #3214.
Why
The original freeze had two root causes:
fdircrawl on the main thread.new AsyncFzf(allFiles, ...)constructor — misleadinglynamed; the constructor itself runs sync and dominates main-thread cost
on big trees, especially after the v2→v1 algorithm switch at 20k files.
Cause 1 was already addressed by #3214 (
feat(core): replace fdir crawler with git ls-files + ripgrep fallback) — the new 3-tier cascade isspawn-based and async. Cause 2 still freezes the Ink render loop on
large workspaces. This PR moves that construction off the main thread.
I closed #3455 because its surface area (worker, ripgrep crawler, prewarm,
IDE companion changes, UX tweaks; ~16 files) drifted too far from
upstream after 451 commits — the merge had 11 file conflicts with 4
double-rewrites. This PR is intentionally narrow: 7 files, ~575 net
insertions.
Architecture
packages/core/src/utils/filesearch/fzfWorker.ts— worker entry. Ownsthe
AsyncFzf<string[]>instance, exposes a four-message RPC(
init→ready/init-error;find→result/find-error;dispose).packages/core/src/utils/filesearch/fzfWorkerHandle.ts— main-threadproxy. Same
find()shape asAsyncFzf<string[]>, swap is one line infileSearch.ts. Handles transport selection: in-thread fallback below~5k files (where worker spawn + IPC outweighs construction cost),
worker above. Awaits
readyinsidecreate()so init errors surfaceearly rather than as a first-find surprise. Worker is
unref()-ed so amid-find @-picker doesn't pin process exit.
packages/core/src/utils/filesearch/fileSearch.ts— one-line swap ofthe
new AsyncFzfcall site, plus an optionaldispose?(): Promise<void>on the
FileSearchinterface so consumers can release workerresources.
packages/vscode-ide-companion/src/webview/handlers/FileMessageHandler.ts—clearFileSearchCachenow callsdispose?()on the previous instanceso the long-running VS Code extension host doesn't accumulate workers
across watcher-triggered evictions.
esbuild.config.js— adds a second esbuild build that emits aself-contained
dist/fzfWorker.js(fzf inlined). Main bundle remainsthe splitting build introduced upstream.
scripts/prepare-package.js— whitelistsfzfWorker.jsin the disttarball.
Why no
whenReady()/ no in-worker crawlThese were the two open critical reviews on #3455 (commit
44fcc11dc,@wenshao 04-24 / 05-09). Both become non-issues by design:
whenReady()post-dispose hang → there is nowhenReady(). Thehandle awaits
readyinside itscreate()factory, so callersobserve init success/failure at construction time. Post-dispose
find()rejects via the per-call pending map'sfailAll.AbortSignalnot threaded through in-process crawl → there is noin-worker crawl. The crawl stays on the main thread (already async via
upstream's spawn-based cascade); only the AsyncFzf construction
moves. Search-time abort is unchanged from main.
Test / dev mode
installInProcessFzfTransport()forces the in-thread path for testsand sandboxes that can't spawn workers. It returns a restorer the
caller MUST run in
afterAll— installing globally intest-setup.tscaused the eager-fs-mock issue @wenshao caught on #3455's earlier
attempt, so this PR does NOT do that.
__setWorkerThresholdForTestsis a separate hook for forcing worker selection on tiny inputs.
npm run dev(tsx) doesn't have the bundleddist/fzfWorker.jsondisk; the handle's
existsSynccheck silently falls back to thein-thread path. Acceptable trade-off — devs running tsx against huge
workspaces will see the same freeze the bundled CLI avoids.
Reviewer Test Plan
npm ci && npm run build && npm run bundle— confirmdist/fzfWorker.jsis emitted alongsidedist/cli.js.npm run test --workspace=packages/core— 104/104 filesearch testspass, including 5 new
fzfWorkerHandle.test.tscases.QWEN_WORKING_DIR=~ npm start(or any tree with >20k files);press
@— Ink input cursor stays responsive during fzf init.Compare to
main: visible freeze.@— should still openinstantly (in-thread path, no worker spawn cost).
Diff size
7 files, +650 / −74. Roughly half of the +650 is the new
fzfWorkerHandle.tsand the worker entry; the rest is the esbuildconfig restructure and the test file.
Linked
crawler.