Skip to content

Commit f5e567e

Browse files
jayzalowitzclaude
andcommitted
fix(#187 AC#2 r3 follow-up): install idempotency + legacy partial migration
Three findings from Copilot's review of PR #249: - **Install destroys good shared file on rename failure**: the previous "unlink target → renameSync" sequence had a window where, if rename threw or got cancelled mid-flight, the host was left with no GGUF at all — even though a perfectly good copy had been deleted moments before. Since the final path is now content-addressable (SHA-256 verified before this point), an existing target IS the model. New install path: if target exists → just discard our partial, treat as installed; otherwise renameSync directly without unlinking; if the rename fails AND target now exists (race lost to another row), treat as installed. - **Legacy non-namespaced partials orphaned by migration**: PR #247 shipped with `<target>.partial`; PR #249 namespaces by row id. Existing pre-#249 partials would sit on disk indefinitely. Added unconditional cleanup at the top of runDownload — if a legacy `<target>.partial` exists, delete it before the row's namespaced partial is created. - **Stale comment in cancelDownload**: docstring still mentioned `<target_path>.partial`. Updated to the namespaced format. Build clean. 25 route tests still passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 4e9d5f8 commit f5e567e

1 file changed

Lines changed: 40 additions & 13 deletions

File tree

apps/api/src/embedded-llm/downloader.ts

Lines changed: 40 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -182,8 +182,9 @@ export async function pauseDownload(downloadId: string): Promise<boolean> {
182182
}
183183

184184
/**
185-
* Cancel and clean up. Aborts in-flight transfer, deletes the
186-
* `<target_path>.partial` file, marks the row 'cancelled'.
185+
* Cancel and clean up. Aborts in-flight transfer, deletes this row's
186+
* partial (`<target_path>.<downloadId>.partial`), marks the row
187+
* 'cancelled'. The shared final GGUF at `<target_path>` is left alone.
187188
*/
188189
export async function cancelDownload(downloadId: string): Promise<boolean> {
189190
const row = await modelDownloadRepository.findById(downloadId);
@@ -242,6 +243,15 @@ async function runDownload(download: ModelDownloadRow): Promise<void> {
242243
}
243244

244245
const partialPath = partialPathFor(download.target_path, download.id);
246+
// Migration: PR #247 wrote to a non-namespaced `<target>.partial`.
247+
// After this PR, partials are namespaced by download row id. Any
248+
// pre-namespacing partial on disk is orphan junk from before the
249+
// upgrade — clean it up unconditionally so it doesn't waste space
250+
// forever.
251+
const legacyPartial = `${download.target_path}.partial`;
252+
if (existsSync(legacyPartial)) {
253+
try { unlinkSync(legacyPartial); } catch { /* best effort */ }
254+
}
245255
// Only resume from the partial if THIS row had progress recorded.
246256
// A fresh row (bytes_downloaded === 0) finding a partial on disk
247257
// means the previous attempt didn't clean up — e.g., a cancel that
@@ -414,17 +424,34 @@ async function runDownload(download: ModelDownloadRow): Promise<void> {
414424
return;
415425
}
416426
await modelDownloadRepository.setStatus(download.id, 'installing');
417-
// Atomic rename — partial → final. Same filesystem so this is O(1).
418-
try {
419-
if (existsSync(download.target_path)) unlinkSync(download.target_path);
420-
renameSync(partialPath, download.target_path);
421-
} catch (err) {
422-
inFlight.delete(download.id);
423-
if (handle.cancelled) return;
424-
await modelDownloadRepository.setStatus(download.id, 'failed', {
425-
error: `install (rename) failed: ${err instanceof Error ? err.message : String(err)}`,
426-
});
427-
return;
427+
// The final target is shared across users (content-addressable —
428+
// we just verified SHA-256 above, so any existing file at this
429+
// path is by definition the same model). If another download
430+
// already installed it (concurrent same-model fetch from a
431+
// different user/row), skip the rename entirely and just discard
432+
// our partial. The previous "unlink target → rename" sequence had
433+
// a window where a renameSync failure could leave the host with
434+
// no GGUF after we'd already deleted the good one.
435+
if (existsSync(download.target_path)) {
436+
try { unlinkSync(partialPath); } catch { /* best effort */ }
437+
} else {
438+
try {
439+
renameSync(partialPath, download.target_path);
440+
} catch (err) {
441+
// Race fallback: someone else won between our existsSync and
442+
// our rename. If the target now exists, treat as installed —
443+
// otherwise this is a real install failure.
444+
if (existsSync(download.target_path)) {
445+
try { unlinkSync(partialPath); } catch { /* best effort */ }
446+
} else {
447+
inFlight.delete(download.id);
448+
if (handle.cancelled) return;
449+
await modelDownloadRepository.setStatus(download.id, 'failed', {
450+
error: `install (rename) failed: ${err instanceof Error ? err.message : String(err)}`,
451+
});
452+
return;
453+
}
454+
}
428455
}
429456

430457
inFlight.delete(download.id);

0 commit comments

Comments
 (0)