fix(desktop): retry atomic writes past transient Windows locks #4193

Merged
esengine merged 1 commit into main-v2 from fix/windows-atomic-rename-retry on Jun 12, 2026
Symptom

A recurring crash group, four different Windows users in one day (surfaced by the new dashboard summary column):

[unhandledrejection]
rename D:\...\.reasonix\desktop-topic-title-sources.json.tmp …\desktop-topic-title-sources.json: Access is denied.

Cause

Antivirus, the Windows Search indexer, or a second instance holds the destination JSON open for a few hundred milliseconds. During that window MoveFileEx (which Go's os.Rename calls on Windows) fails with ERROR_ACCESS_DENIED. The desktop's cosmetic-state writers did a raw os.Rename(tmp, path), bypassing fileutil.ReplaceFile and with no retry, so the error propagated up a bound method and reached the frontend as an uncaught promise rejection. The result was the full crash overlay painted over a failed cosmetic title-cache write.

Fix

  • fileutil.ReplaceFile: retry the replace a few times with a short escalating backoff while the tmp source still exists — a missing tmp means the write itself failed, so it fails fast with no pointless retry. The EXDEV copy fallback is unchanged. Every existing caller (branch metadata, session titles, config, dotenv, acp) gains the same Windows robustness.
  • desktop/tabs.go: route the six raw os.Rename(tmp, path) writers (tabs, projects, topic titles / sources / created-ats, telemetry snapshot) through ReplaceFile.

Verification

  • go test ./internal/fileutil — new tests cover fast-fail on missing tmp and retry-then-return-error when the destination can never be replaced (tmp preserved for the next attempt).
  • go test ./desktop -run 'Topic|Tab|Telemetry|Title' and go vet ./desktop — green.

A truly permanent FS error (e.g. a read-only .reasonix) still surfaces; that's a broken environment, not the transient lock these reports show. Catching the rejection frontend-side so even that degrades gracefully is a reasonable follow-up.

Four users crashed the same day on `rename ...desktop-topic-title-sources.json.tmp: Access is denied` — antivirus / the search indexer / a second instance briefly locks the destination JSON, MoveFileEx fails, and the error reached the frontend as an uncaught [unhandledrejection] that paints the crash overlay.

The desktop's cosmetic-state writers (tabs, projects, topic titles/sources/created-ats, telemetry) did a raw os.Rename(tmp, path) with no retry, bypassing fileutil.ReplaceFile. Harden ReplaceFile to retry with a short backoff while the tmp source survives — a missing tmp means the write itself failed, so no retry can help — keep the EXDEV copy fallback, and route the six sites through it. Branch-meta / session / config writers already on ReplaceFile get the same robustness for free.
esengine requested a review from SivanCola as a code owner on Jun 12, 2026, 13:32
github-actions bot added the labels v2 (Go rewrite (1.x), main-v2 branch, active development) and desktop (Wails desktop app, desktop/**) on Jun 12, 2026
Comment thread on internal/fileutil/atomicwrite.go: dismissed
esengine merged commit 1b57e0c into main-v2 on Jun 12, 2026
14 checks passed
esengine deleted the fix/windows-atomic-rename-retry branch on Jun 12, 2026, 13:40