fix: chain reloadForScope when scope/commit changes mid-flight#482
Merged
tomasz-tomczyk merged 5 commits intomainfrom May 7, 2026
Merged
fix: chain reloadForScope when scope/commit changes mid-flight#482tomasz-tomczyk merged 5 commits intomainfrom
tomasz-tomczyk merged 5 commits intomainfrom
Conversation
The dedup guard returned the in-flight reload promise unconditionally, so a click that swapped scope while a previous reload was still running awaited the OLD scope's promise. The new scope's /api/session was never fetched, leaving the page rendered with the previous scope's files but the new toggle visually active. This was a Windows-only e2e flake because Linux file I/O finishes loadAllFileData() faster than the user can click the next toggle in tests. On Windows runners the reload outlasts the next click handler, so the race fires deterministically. Different tests fail across runs (commit-selection / round-complete / threading) — anything that triggers a reload then immediately follows up with another state change. Fix: track (scope, commit) per in-flight reload. Same key → dedup (existing behaviour, collapses double-triggers like SSE base-changed landing during a manual click). Different key → chain after the current reload finishes via .then(reloadForScope), so each distinct request actually runs.
Codecov Report❌ Patch coverage is
❌ Your patch check has failed because the patch coverage (26.66%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #482 +/- ##
==========================================
- Coverage 69.27% 68.94% -0.34%
==========================================
Files 43 45 +2
Lines 10817 10902 +85
==========================================
+ Hits 7494 7516 +22
- Misses 2755 2818 +63
Partials 568 568
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
TestClientExitsOnFinish only killed the client process in t.Cleanup, but crit's daemon is intentionally detached and survives the client's exit. On Windows the surviving daemon's cwd handle on repoDir caused t.TempDir RemoveAll to fail with "file in use by another process" — the test would pass but Go reported the cleanup error and failed the run. Read the daemon PID from the session file (waitForDaemonPort already parses it; rename to waitForDaemonSession and return the full entry), then terminate it in t.Cleanup using the same terminate→poll→Kill escalation stopDaemon already uses. Once the daemon process is gone its file handles are released and RemoveAll succeeds. POSIX would have ignored the open handles and unlinked anyway; this manifested only on Windows.
Killing the daemon and polling processExists wasn't enough — Windows can delay releasing the cwd directory handle for tens of milliseconds after the process object reports exit, so t.TempDir's RemoveAll still hit "file in use by another process". Add a best-effort retry loop (40 × 50ms = up to 2s) on resolvedRepo inside the cleanup, gated to windows. Runs before t.TempDir's own cleanup via LIFO, so by the time the framework cleans up there's nothing left to delete and the spurious error is gone. POSIX is unaffected: it ignores open handles when unlinking.
terminateProcess on the daemon PID alone wasn't enough on Windows: crit's daemon polls `git status` once per second from watch.go and spawns git/gh/sl subprocesses during diff/PR work. Each child inherits repoDir as its cwd and holds an independent directory handle. Windows' TerminateProcess does not cascade to descendants, so any in-flight git child survived, kept the cwd locked, and t.TempDir's RemoveAll still hit "file in use by another process" — a 2s retry loop wasn't enough because a fresh git child can spawn every second the daemon is alive. Wrap the spawned crit (and therefore its whole tree) in a Job Object with JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE. Start the parent suspended, assign to the job, then NtResumeProcess — closes the spawn-then-assign race in which a child could escape the job. TerminateJobObject in cleanup brings the entire tree down atomically; the existing per-PID terminate path is kept as a safety net for the daemon itself. Unix is a no-op (SIGTERM to the daemon is sufficient there, and Unix doesn't hold cwd handles open the way Windows does). Drop the now-redundant 40×50ms RemoveAll retry loop — the Job Object kill is synchronous and complete.
SignalRoundComplete clears f.Comments synchronously and signals the watcher to process the round; the watcher then runs git status, loads disk, and calls carryForwardFileComments. carryForwardFileComments unconditionally cleared f.Comments before re-appending carried-forward PreviousComments, so any comment that landed between SignalRoundComplete and the watcher's run was wiped out. The race window is microseconds on Linux/macOS but widens on Windows because git status and file I/O take noticeably longer there. Symptom: agent posts a comment immediately after /api/round-complete returns, comment briefly appears, then disappears when the watcher finally processes the round; on the next finish-review the no-changes overlay fires because only carried-forward comments remain. The e2e-windows flake at round-complete.spec.ts:732 was the visible tip. Filter f.Comments to preserve entries whose ID is not in PreviousComments — those are brand-new for this round, not carry-forward targets. carryForwardAllComments had a parallel bug: the len(f.Comments) > 0 skip silently dropped PreviousComments for files that only contained race-window-added comments. Replace the count check with a CarriedForward marker check. This is a real backend correctness bug, not a test-only fix; surfacing it as Windows-only flakiness was an artifact of the platform's I/O latency, not the symptom's cause.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
reloadInFlightdedup guard inreloadForScope()returned the in-flight promise unconditionally, so a click that swapped scope while a previous reload was still running awaited the OLD scope's promise. The new scope's/api/sessionwas never fetched — page stayed on the previous scope's files while the toggle visually flipped.loadAllFileData()faster than the next click, so the dedup branch was rarely hit. Windows runners are slow enough that the reload outlasts the next click handler, and the race fired deterministically — which is why different tests failed across runs (commit-selection,round-complete,threading).(scope, commit)per in-flight reload. Same key → dedup (existing behaviour, collapses double-triggers like SSEbase-changedlanding during a manual click). Different key → chain after the current reload finishes via.then(reloadForScope, reloadForScope). The rejection arm prevents a failed reload from poisoning subsequent scope changes.Review
finallyruns againstreloadInFlight === null, so chained calls re-readdiffScope/diffCommitand dispatch correctly)crit-web/doesn't have scope toggles (single shared review viewer)Test plan
commit-selection.spec.tsruns locally against the freshly built binary: 13/13 pass on macOS including the previously-flaky:111:7"commit picker reappears when switching back to All scope".gofmt,go vet,golangci-lintall clean (no Go changes).e2e-windowshas been red onmainsince feat: add Windows + WSL support #459 landed; this PR should turn it green.🤖 Generated with Claude Code