fix(linux): fix race conditions and deadlocks in event handling#5192
Conversation
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughSwitches mainthread callback dispatch from read locks to exclusive locks across platforms, adds a test verifying DispatchWailsEvent releases the app lock before per-window dispatch, and includes a regression app, assets, Taskfile, and changelog entry documenting the fixes. ChangesMainthread Event Emission Race Condition Fix
Sequence Diagram(s)sequenceDiagram
participant EventEmitter
participant MainThreadDispatcher
participant CallbackStore
participant MenuHandler
EventEmitter->>MainThreadDispatcher: Emit event
MainThreadDispatcher->>CallbackStore: Lock (exclusive) and lookup callbackID
CallbackStore-->>MainThreadDispatcher: return callback fn / nil
MainThreadDispatcher->>CallbackStore: Delete callbackID
MainThreadDispatcher->>CallbackStore: Unlock
MainThreadDispatcher->>MainThreadDispatcher: Invoke callback (fn())
MenuHandler->>MainThreadDispatcher: Menu click handling (may acquire windowsLock)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
Suggested Labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Addresses issues that could cause menu buttons to stop responding when events are emitted frequently: Fix race condition in mainthread callback store (Linux, macOS, iOS): - Change RLock() to Lock() when deleting from mainThreadFunctionStore. Using RLock for the read but then calling delete() violated the RWMutex contract and could cause undefined behaviour under concurrent access. The related EventIPCTransport.DispatchWailsEvent deadlock identified during this investigation was independently fixed on master in #5107. Adds regression tests and a manual reproduction app under v3/test/4424-event-block/.
6b5193b to
185b0b2
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
v3/pkg/application/transport_event_ipc_test.go (1)
80-103: ⚡ Quick winBroaden this regression test to exercise the actual window dispatch path.
With
windowsinitialized empty, this can pass without touching the potentially blockingExecJSbranch tied to#4424. Add a stub window and assert lock behavior while dispatch is in-flight.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@v3/pkg/application/transport_event_ipc_test.go` around lines 80 - 103, The test currently never exercises the window dispatch path because app.windows is empty; modify TestDispatchWailsEventReleasesLockAfterCompletion to insert a stub Window into app.windows (e.g., id 1) that implements the ExecJS method and blocks on a channel so DispatchWailsEvent goes through the ExecJS branch. Start DispatchWailsEvent in a goroutine, assert that while the stub's ExecJS is blocked the windowsLock is still held (try to acquire it and expect to block), then unblock the stub ExecJS channel and finally assert windowsLock is released (acquire succeeds within timeout). Reference symbols: TestDispatchWailsEventReleasesLockAfterCompletion, App.windows, App.windowsLock, EventIPCTransport.DispatchWailsEvent, CustomEvent, and the window stub's ExecJS.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@v3/pkg/application/transport_event_ipc_test.go`:
- Around line 80-103: The test currently never exercises the window dispatch
path because app.windows is empty; modify
TestDispatchWailsEventReleasesLockAfterCompletion to insert a stub Window into
app.windows (e.g., id 1) that implements the ExecJS method and blocks on a
channel so DispatchWailsEvent goes through the ExecJS branch. Start
DispatchWailsEvent in a goroutine, assert that while the stub's ExecJS is
blocked the windowsLock is still held (try to acquire it and expect to block),
then unblock the stub ExecJS channel and finally assert windowsLock is released
(acquire succeeds within timeout). Reference symbols:
TestDispatchWailsEventReleasesLockAfterCompletion, App.windows, App.windowsLock,
EventIPCTransport.DispatchWailsEvent, CustomEvent, and the window stub's ExecJS.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ba5ea097-d2c3-4289-a128-98a0760bf85d
📒 Files selected for processing (7)
v3/UNRELEASED_CHANGELOG.mdv3/pkg/application/mainthread_darwin.gov3/pkg/application/mainthread_ios.gov3/pkg/application/mainthread_linux.gov3/pkg/application/transport_event_ipc_test.gov3/test/4424-event-block/Taskfile.ymlv3/test/4424-event-block/main.go
There was a problem hiding this comment.
Pull request overview
Fixes concurrency issues in v3 event/main-thread handling, especially around callback-store mutation and event dispatch locks for #4424.
Changes:
- Switches main-thread callback deletion on Linux, macOS, and iOS from
RLocktoLock. - Updates event IPC dispatch to avoid holding
windowsLockwhile dispatching to windows. - Adds unit/manual tests and changelog entries for the race/deadlock fixes.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
v3/UNRELEASED_CHANGELOG.md |
Adds unreleased notes for #4424 and related fixes. |
v3/pkg/application/transport_event_ipc.go |
Snapshots windows before dispatch to avoid lock retention during ExecJS. |
v3/pkg/application/mainthread_linux.go |
Uses write locking when reading/deleting callback-store entries. |
v3/pkg/application/mainthread_darwin.go |
Uses write locking when reading/deleting callback-store entries. |
v3/pkg/application/mainthread_ios.go |
Uses write locking when reading/deleting callback-store entries. |
v3/pkg/application/transport_event_ipc_test.go |
Adds concurrency-focused tests for callback-store and dispatch locking. |
v3/test/4424-event-block/Taskfile.yml |
Adds a task to run the manual repro app. |
v3/test/4424-event-block/main.go |
Adds a manual app intended to reproduce event/menu blocking. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| func TestDispatchWailsEventReleasesLockAfterCompletion(t *testing.T) { | ||
| app := &App{ | ||
| windows: make(map[uint]Window), | ||
| windowsLock: sync.RWMutex{}, | ||
| } | ||
|
|
||
| transport := &EventIPCTransport{app: app} | ||
| event := &CustomEvent{Name: "test"} | ||
|
|
||
| transport.DispatchWailsEvent(event) | ||
|
|
||
| acquired := make(chan struct{}) | ||
| go func() { | ||
| app.windowsLock.Lock() | ||
| close(acquired) | ||
| app.windowsLock.Unlock() | ||
| }() | ||
|
|
||
| select { | ||
| case <-acquired: | ||
| case <-time.After(time.Second): | ||
| t.Fatal("windowsLock should be released after DispatchWailsEvent completes") | ||
| } |
| app.Event.Emit("backendmessage", map[string]interface{}{ | ||
| "counter": counter, | ||
| "time": time.Now().Format(time.RFC3339), | ||
| }) |
| var lock sync.RWMutex | ||
| store := make(map[uint]func()) | ||
| var wg sync.WaitGroup | ||
|
|
||
| const writers = 50 | ||
| const readers = 50 | ||
|
|
||
| for i := 0; i < writers; i++ { | ||
| wg.Add(1) | ||
| go func(id uint) { | ||
| defer wg.Done() | ||
| lock.Lock() | ||
| store[id] = func() {} | ||
| lock.Unlock() | ||
|
|
||
| lock.Lock() | ||
| delete(store, id) | ||
| lock.Unlock() | ||
| }(uint(i)) | ||
| } | ||
|
|
||
| for i := 0; i < readers; i++ { | ||
| wg.Add(1) | ||
| go func() { | ||
| defer wg.Done() | ||
| lock.RLock() | ||
| _ = len(store) | ||
| lock.RUnlock() | ||
| }() | ||
| } | ||
|
|
||
| wg.Wait() |
| - Fix window menu crash on Wayland caused by appmenu-gtk-module accessing unrealized window (#4769) by @leaanthony | ||
| - Fix GTK application crash when app name contains invalid characters (spaces, parentheses, etc.) by @leaanthony | ||
| - Fix "not enough memory" error when initializing drag and drop on Windows (#4701) by @overlordtm | ||
| - Fix race condition in mainthread callback store using incorrect RLock for map deletion (Linux, macOS, iOS) (#4424) by @leaanthony |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
- mainthread_android.go: apply the same RLock -> Lock change made to the linux/macOS/iOS callback paths. Android uses the same shared mainThreadFunctionStore, so it had the identical map-write under RLock pattern. - transport_event_ipc_test: the previous lock-release test used an empty windows map and so would also have passed against the broken (lock-held-across-dispatch) implementation. Replace it with a probe Window whose DispatchWailsEvent attempts windowsLock.TryLock(); the fixed snapshot-and-release path lets the TryLock succeed, a regression to holding RLock across per-window dispatch fails it. Verified by mutation-test against the regressed implementation. - test/4424-event-block: the manual repro had no webview window and so EventIPCTransport.DispatchWailsEvent's per-window loop was a no-op, making the harness unable to reach the ExecJS path involved in #4424. Add a small webview window that subscribes to the emitted event, bump the emit cadence to 100ms so menu clicks have a high probability of landing during a dispatch, and document the manual pass/fail criteria in the file header.
Summary
RLockused for map deletion inmainthread_linux.go,mainthread_darwin.go, andmainthread_ios.go— map writes underRLockcause undefined behavior with concurrent readersDispatchWailsEventholdingwindowsLockduring potentially blockingExecJSdispatch — release lock before dispatching to prevent deadlocks when the main thread also needs the lockFatalin nil callback path to avoid lock leakFixes #4424
Test plan
TestMainThreadFunctionStoreLockCorrectness— concurrent callback store access with Lock (not RLock)TestMainThreadFunctionStoreConcurrentReadWrite— concurrent readers and writersTestDispatchWailsEventReleasesLockAfterCompletion— lock is released after dispatchv3/test/4424-event-block/main.go— manual test with event emitter + menu clicksSummary by CodeRabbit
New Features
Bug Fixes
Tests