Skip to content

fix(NotificationCenter): lock order inversion deadlock #5159#5160

Merged
matejk merged 18 commits intomainfrom
5159-nc-deadlock
Jan 15, 2026
Merged

fix(NotificationCenter): lock order inversion deadlock #5159#5160
matejk merged 18 commits intomainfrom
5159-nc-deadlock

Conversation

@aleks-f
Copy link
Copy Markdown
Member

@aleks-f aleks-f commented Jan 10, 2026

Branch Summary: 5159-nc-deadlock

This branch fixes several thread-safety issues and deadlocks reported in issues #5159, #5161, and #5162.

Closes #5159, #5161, and #5162.

Issues Addressed

New Components

IOLock (Foundation/include/Poco/IOLock.h, Foundation/src/IOLock.cpp)

A new lock-free synchronization primitive for coordinating I/O operations with close/shutdown:

  • IOLock - tracks whether an I/O operation is in progress and whether close has been requested
  • ScopedIOLock - RAII wrapper that enters on construction and leaves on destruction
  • Uses std::atomic<bool> for _closed and _inProgress flags
  • Provides markClosed(), wait(), tryWait(), and close() methods

ScopedSocketReactor (Net/include/Poco/Net/SocketReactor.h)

A new RAII wrapper utility class that starts a SocketReactor in a background thread and automatically stops/joins on destruction:

ScopedSocketReactor reactor;
reactor->addEventHandler(socket, observer);
// reactor runs in background thread
// automatically stopped and joined when reactor goes out of scope

Bug Fixes

NotificationCenter (#5159)

  • Fixed lock order inversion deadlock in observer notification

Pipe (#5161)

  • Added IOLock synchronization for thread-safe read/write operations
  • Fixed close ordering: write end must be closed before read end to send EOF to blocked readers
  • Fixed race condition: wait() for I/O completion now happens before closing file descriptors
  • Affects both POSIX and Windows implementations

PollSet (#5161)

  • Added IOLock to coordinate poll() with destructor
  • Prevents use-after-free when poll() is running during destruction
  • Exit early on poll error when closed
  • Affects epoll, poll, and select implementations

Test Improvements

New Tests

  • PipeTest - comprehensive test suite including testCloseRace for concurrent close/read
  • testConcurrentHandlerRemoval in SocketReactorTest - tests concurrent handler removal

CI Enhancements

  • Added sanitized builds (ASan, TSan, UBSan) to CI workflow

Files Changed

Component Files
IOLock IOLock.h, IOLock.cpp (new)
Pipe PipeImpl_POSIX.h/cpp, PipeImpl_WIN32.h/cpp, Pipe.cpp
PollSet PollSet.cpp
SocketReactor SocketReactor.h, SocketReactor.cpp
Tests PipeTest.cpp/h, SocketReactorTest.cpp/h, NotificationCenterTest.cpp/h
CI .github/workflows/ci.yml

Key Code Changes

Pipe Close Sequence (Before)

void PipeImpl::closeRead() {
    if (_readfd != -1) close(fd);      // Race: close while read() in progress
}

Pipe Close Sequence (After)

void PipeImpl::closeRead() {
    _readLock.markClosed();
    _readLock.wait();              // Wait for any in-progress read to finish
    int fd = _readfd.exchange(-1);
    if (fd != -1) close(fd);       // Now safe to close
}

Destructor Close Order (Before)

~PipeImpl() {
    closeRead();   // Wrong: blocks readers waiting for data
    closeWrite();
}

Destructor Close Order (After)

~PipeImpl() {
    closeWrite();  // Correct: sends EOF to unblock readers
    closeRead();
}

@aleks-f aleks-f requested a review from matejk January 10, 2026 23:27
@aleks-f aleks-f linked an issue Jan 10, 2026 that may be closed by this pull request
@aleks-f aleks-f changed the title fic(NotificationCenter): lock order inversion deadlock #5159 fix(NotificationCenter): lock order inversion deadlock #5159 Jan 10, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 9 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Contributor

@matejk matejk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments in the code. In addition: Would IOLock also deserve its own unit tests?

//
// Definition of the IOLock class.
//
// Copyright (c) 2004-2008, Applied Informatics Software Engineering GmbH.,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrong (C) year. Blast from the past! :)

RWLock::ScopedWriteLock lock(_mutex);
_observers.emplace_back(observer.clone());
_observers.back()->start();
AbstractObserverPtr pObserver = observer.clone();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the observer have to be cloned before lock?

Copy link
Copy Markdown
Member Author

@aleks-f aleks-f Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because this is precisely where the deadlock potential resides:

pObserver->start();

namespace Poco {


class Foundation_API IOLock
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps IOResourcePin or IOResourceHold to distinguish from mutex-like locking?

void IOLock::wait()
{
while (_inProgress.load())
std::this_thread::sleep_for(std::chrono::microseconds(10));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use std::condition_variable instead of busy loop?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2ec60b0
let's see how that does in ci ...


bool IOLock::enter()
{
_inProgress.store(true);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to add detection if the second thread called enter()? debug assertion or exception?

Copy link
Copy Markdown
Member Author

@aleks-f aleks-f Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IOLock explicitly does not support multi-threaded I/O. It can be done, but I don't think the complications it brings are worth it, the class does one thing only - syncing between destruction in one thread and I/O in another. Whoever is doing l/O from multiple threads, I wish him good luck.

EDIT: to be more specific, whoever is doing I/O on a single channel, or polls, from multiple threads

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant was to detect multiple threads using IOLock and prevent that (or log or throw and exception).

Copy link
Copy Markdown
Member Author

@aleks-f aleks-f Jan 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess one thing that could be done is to keep thread references and add some sanity checks, but I would do it for debug build only and I really don't have time to do it. As far as I'm concerned, we made a knife, if someone gets cut, thats' on them.

@matejk
Copy link
Copy Markdown
Contributor

matejk commented Jan 13, 2026

@aleks-f , ready to merge?

@matejk matejk merged commit 24b1fc1 into main Jan 15, 2026
100 checks passed
@matejk matejk deleted the 5159-nc-deadlock branch January 15, 2026 20:37
@matejk matejk mentioned this pull request Jan 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Lock-order-inversion deadlock in NotificationCenter

3 participants