fix(config): resolve data race in config_watcher inotify fd access #398
Make inotify_fd_ and watch_fd_ atomic integers on Linux to eliminate the data race between cleanup_inotify() and watch_loop_linux(). This follows the same pattern already used for kqueue_fd_ on macOS/BSD:
- Use std::atomic<int> for file descriptors shared across threads
- Use atomic exchange in cleanup to safely invalidate descriptors
- Use atomic load in watch loop to read current descriptor values

Remove the corresponding TSan suppression as the root cause is fixed.

Closes #397
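The atomic-descriptor pattern described in this commit can be sketched as follows. This is a minimal illustration, not the code in config_watcher.h: the shared_fd class, its method names, and the memory orderings are assumptions, and a pipe's read end stands in for the inotify descriptor so the sketch runs without a real watch.

```cpp
#include <atomic>
#include <cassert>
#include <unistd.h>  // pipe, close

// Hypothetical wrapper; the class and method names are illustrative only.
class shared_fd {
public:
    explicit shared_fd(int fd) : fd_(fd) {}
    ~shared_fd() { cleanup(); }

    // Watch-loop side: take a snapshot; -1 means cleanup already ran.
    int get() const { return fd_.load(std::memory_order_acquire); }

    // Cleanup side: swap in -1 and close whatever value was stored.
    // Returns true only for the caller that actually closed the descriptor.
    bool cleanup() {
        const int fd = fd_.exchange(-1, std::memory_order_acq_rel);
        if (fd < 0) return false;
        ::close(fd);
        return true;
    }

private:
    std::atomic<int> fd_;
};

// Usage sketch: a pipe's read end stands in for the inotify descriptor.
inline bool shared_fd_demo() {
    int p[2];
    if (::pipe(p) != 0) return false;
    shared_fd fd(p[0]);
    const bool saw_open = (fd.get() == p[0]);
    const bool first = fd.cleanup();   // closes the descriptor
    const bool second = fd.cleanup();  // no-op: value is already -1
    assert(fd.get() == -1);            // descriptor is invalidated after cleanup
    ::close(p[1]);
    return saw_open && first && !second;
}
```

The exchange guarantees that at most one caller closes a given descriptor value. It protects only the integer itself: a read() already in flight on the old value can still overlap the close().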
CI/CD Failure Analysis
Analysis Time: 2026-03-06 12:55 UTC
Failed Workflows
Root Cause Analysis
Primary Error:
Analysis: The root cause is in
Identified Issues:
Proposed Fix
Next Steps
Automated failure analysis - Attempt #1
The previous atomic fix addressed the integer variable race but not the underlying file descriptor race: close() on the main thread could race with read() on the watcher thread operating on the same fd value.

Fix by restructuring the shutdown sequence:
- Add eventfd for clean shutdown signaling on Linux
- Poll both inotify fd and shutdown fd in watch loop
- Reorder stop(): signal -> join -> cleanup (close fds only after the watcher thread has exited)

This eliminates the TSan-detected race between close() and read() on the inotify file descriptor.
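The signal -> join -> cleanup ordering might look roughly like the sketch below. It is Linux-only and written under stated assumptions: watcher_sketch and all of its members are hypothetical names, the watch mask (IN_MODIFY) and buffer size are arbitrary, and error handling is reduced to the minimum. It is not the project's implementation.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <poll.h>
#include <sys/eventfd.h>
#include <sys/inotify.h>
#include <thread>
#include <unistd.h>

// Hypothetical watcher; names are illustrative, not the project's API.
class watcher_sketch {
public:
    ~watcher_sketch() { stop(); }

    bool start(const char* path) {
        inotify_fd_ = ::inotify_init1(IN_NONBLOCK | IN_CLOEXEC);
        shutdown_fd_ = ::eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
        if (inotify_fd_ < 0 || shutdown_fd_ < 0) { cleanup(); return false; }
        watch_fd_ = ::inotify_add_watch(inotify_fd_, path, IN_MODIFY);
        if (watch_fd_ < 0) { cleanup(); return false; }
        thread_ = std::thread([this] { loop(); });
        return true;
    }

    void stop() {
        if (!thread_.joinable()) return;
        // 1. Signal: make the eventfd readable so poll() wakes up.
        const std::uint64_t one = 1;
        const ssize_t n = ::write(shutdown_fd_, &one, sizeof one);
        assert(n == static_cast<ssize_t>(sizeof one));
        (void)n;
        // 2. Join: after this, no thread can be inside read() or poll().
        thread_.join();
        // 3. Cleanup: only now is it safe to close the descriptors.
        cleanup();
    }

    int events_seen() const { return events_.load(); }
    bool closed() const { return inotify_fd_ < 0 && shutdown_fd_ < 0; }

private:
    void loop() {
        pollfd fds[2] = {{inotify_fd_, POLLIN, 0}, {shutdown_fd_, POLLIN, 0}};
        for (;;) {
            if (::poll(fds, 2, -1) < 0) {
                if (errno == EINTR) continue;
                return;
            }
            if (fds[1].revents & POLLIN) return;  // shutdown requested
            if (fds[0].revents & POLLIN) {
                alignas(inotify_event) char buf[4096];
                if (::read(inotify_fd_, buf, sizeof buf) > 0) ++events_;
            }
        }
    }

    void cleanup() {
        // Closing the inotify descriptor also removes its watches.
        if (inotify_fd_ >= 0) { ::close(inotify_fd_); inotify_fd_ = -1; }
        if (shutdown_fd_ >= 0) { ::close(shutdown_fd_); shutdown_fd_ = -1; }
        watch_fd_ = -1;
    }

    int inotify_fd_ = -1, watch_fd_ = -1, shutdown_fd_ = -1;
    std::atomic<int> events_{0};
    std::thread thread_;
};
```

A caller would use it as `watcher_sketch w; w.start("."); /* ... */ w.stop();`. In this sketch the descriptors are plain ints because they are written only before the thread starts and after it has been joined; whether the real members stay std::atomic<int> is up to the actual header.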
CI/CD Fix Verification - Attempt #1 Result
Status: All checks passing
Fix Applied
The eventfd-based shutdown signaling approach successfully resolved the TSan data race:
CI Results
All 27 checks passing. Ready for review.
Automated fix verification - Attempt #1 of 3 (resolved)
* fix(config): resolve data race in config_watcher inotify fd access
* fix(config): use eventfd for shutdown signaling to avoid fd race
Closes #397
Summary
- Make inotify_fd_ and watch_fd_ std::atomic<int> on Linux to eliminate the data race between cleanup_inotify() (main thread) and watch_loop_linux() (watcher thread)
- Follow the same pattern already used for kqueue_fd_ on macOS/BSD
- Remove the TSan suppression for config_watcher, as the root cause is now fixed

Changes
- include/kcenon/common/config/config_watcher.h: Convert plain int fd members to std::atomic<int>; use exchange(-1) in cleanup and load() in the watch loop
- sanitizers/tsan_suppressions.txt: Remove config_watcher suppression entries

Test Plan
- common_config_watcher_test passes