Skip to content

Fix thread counter leak#3992

Merged
aleks-f merged 1 commit intopocoproject:develfrom
alex65536:fix-thread-counter-leak
Jun 10, 2023
Merged

Fix thread counter leak#3992
aleks-f merged 1 commit intopocoproject:develfrom
alex65536:fix-thread-counter-leak

Conversation

@alex65536
Copy link
Copy Markdown
Contributor

I noticed that Poco HTTP server becomes slow after some uptime, especially when the load is high. The reason is that there is a race condition when decrementing _currentThreads. Because of this, _currentThreads always becomes bigger than the current number of threads.

The fix was already proposed before in the Clickhouse fork, see ClickHouse#28. I have just cherry-picked it to be merged into upstream Poco.

Cherry-picked from ClickHouse#28, see the
mentioned PR for more details.
@aleks-f aleks-f added this to the Release 1.13.0 milestone May 5, 2023
@aleks-f aleks-f requested a review from obiltschnig May 5, 2023 14:35
@aleks-f
Copy link
Copy Markdown
Member

aleks-f commented May 5, 2023

@alex65536 wouldn't just adding check for _stopped fix the problem?

if (_pDisp->_stopped || (_pDisp->_currentThreads > 1 && _pDisp->_queue.empty()))

@alex65536
Copy link
Copy Markdown
Contributor Author

@alex65536 wouldn't just adding check for _stopped fix the problem?

I don't think so. The bug is that the server is slowly degrading, and the server doesn't need to be stopped in order to reproduce it. So I don't believe that just adding stopped_ will help.

@aleks-f
Copy link
Copy Markdown
Member

aleks-f commented May 10, 2023

@alex65536 wouldn't just adding check for _stopped fix the problem?

I don't think so. The bug is that the server is slowly degrading, and the server doesn't need to be stopped in order to reproduce it. So I don't believe that just adding stopped_ will help.

OK, but then the fix you submitted won't fix it either, because it's doing the same thing as the destructor would if modified as I suggested. If I'm missing something, please explain in detail, "slowly degrading" does not make sense to me; I admit I have scarce time to spend on this, but I do not want to again merge fix which is not fixing the problem.

@alex65536
Copy link
Copy Markdown
Contributor Author

alex65536 commented May 11, 2023

The difference is that both the current code and your proposed change perform the checks twice, and both checks don't happen under the same lock. This PR will perform the check only once.

I was not proposing to check it twice. I meant to (1) modify the check in the ~ThreadCountWatcher() and (2) remove the one at the bottom of TCPServerDispatcher::run() for loop. I just didn't explicitly stated the (2), but that's why I said it was the same thing. but I guess it doesn't really matter which way it is done since it is the same thing in the end.

@alex65536
Copy link
Copy Markdown
Contributor Author

Any progress on the review?

@aleks-f aleks-f merged commit c6fd0db into pocoproject:devel Jun 10, 2023
aleks-f pushed a commit that referenced this pull request Jun 10, 2023
Cherry-picked from ClickHouse#28, see the
mentioned PR for more details.

Co-authored-by: alexey-milovidov <milovidov@yandex-team.ru>
aleks-f pushed a commit that referenced this pull request Nov 27, 2023
Cherry-picked from ClickHouse#28, see the
mentioned PR for more details.

Co-authored-by: alexey-milovidov <milovidov@yandex-team.ru>
@Burgch
Copy link
Copy Markdown
Contributor

Burgch commented Dec 4, 2023

Just wanted to say thank you for this fix - it also fixes a race condition where the value of _currentThreads can be decremented excessively. It's basically the opposite of the race condition described in ClickHouse#28 - if the queue is empty when the ThreadCountWatcher is destructed, but then not empty when the condition is checked at the end of the for-loop, the thread will continue and construct a new ThreadCountWatcher on the next iteration. This means _currentThreads can be decremented more than once by any given thread.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants