-
-
Notifications
You must be signed in to change notification settings - Fork 33.9k
bpo-33929: multiprocessing: fix handle leak on race condition #7921
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@pitrou: My first version was wrong, but I fixed it and now tests pass on AppVeyor and VSTS (Windows). Would you mind to review this change please? I'm not 100% sure that it's correct, but I'm now sure that the current code has a race condition :-) |
Lib/multiprocessing/reduction.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So is this function still used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's part of the public API, no? At least, it's exported in all on Windows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this comment is useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first version of my patch tried to close sentinel as soon as the process completed. But it broke a test. Then I understood that it's public attribute, and so it cannot be closed here. For me it's non-obvious that close() cannot be closed "soon" (at least, not by Popen itself).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. The comment confuses me more than the situation it's supposed to comment :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/it/the read end of pipe/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/has been/had been/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"before it could steal the handle from the parent process"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Lib/multiprocessing/reduction.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit annoying. One argument is a process handle and the other is a pid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
steal_handle() also has a weird API, but I agree that it's not a good reason to continue that :-) I changed the parameter to source_process, and moved the pid => handle dance to the caller.
vstinner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rebased my PR to be able to rewrite the commit message and update properly the NEWS entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Lib/multiprocessing/reduction.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
steal_handle() also has a weird API, but I agree that it's not a good reason to continue that :-) I changed the parameter to source_process, and moved the pid => handle dance to the caller.
bpo-33929: Fix a race condition in Popen of multiprocessing.popen_spawn_win32. The child process now duplicates the read end of pipe instead of "stealing" it. Previously, the read end of pipe was "stolen" by the child process, but it leaked a handle if the child process had been terminated before it could steal the handle from the parent process.
|
@pitrou: I fixed all the issues that you reported. Would you mind to review my updated PR? Thanks. By the way, I found a similar but different bug: https://bugs.python.org/issue33966 But it's a different code path, so I think that it's fine to fix it in a different PR. |
Ok, fair. I removed the comment. |
|
Thank you for unconfusing me :-) |
pitrou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @vstinner !
LOL |
|
Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.6, 3.7. |
|
GH-7960 is a backport of this pull request to the 3.7 branch. |
…GH-7921) Fix a race condition in Popen of multiprocessing.popen_spawn_win32. The child process now duplicates the read end of pipe instead of "stealing" it. Previously, the read end of pipe was "stolen" by the child process, but it leaked a handle if the child process had been terminated before it could steal the handle from the parent process. (cherry picked from commit 2cc9d21) Co-authored-by: Victor Stinner <vstinner@redhat.com>
|
Sorry, @vstinner, I could not cleanly backport this to |
|
GH-7961 is a backport of this pull request to the 3.6 branch. |
|
Victor, I'm skeptical about backporting this. It's not an important issue and we risk introducing regressions in a bugfix release. Let's make it a 3.8-only change? |
I'm trying to understand why multiprocessing tests are failing randomly but also frequently, for one year (if not longer, 3 years?). Fixing race conditions should reduce random tests failures, but I'm not sure that leaking open Windows handles can explain these failures. About regression: IMHO the new code is less fragile and should not have no race condition. But I'm ok to not backport this change, except if someone proves that this change helps to make tests more stable :-) |
Fix a race condition in Popen of multiprocessing.popen_spawn_win32. The child process now duplicates the read end of pipe instead of "stealing" it. Previously, the read end of pipe was "stolen" by the child process, but it leaked a handle if the child process had been terminated before it could steal the handle from the parent process. (cherry picked from commit 2cc9d21) Co-authored-by: Victor Stinner <vstinner@redhat.com>
|
I rejected my 3.6 backport: PR 7961. But @miss-islington merged its 3.7 backport: PR #7960 :-( @pitrou: Should I revert the backport to 3.7? Maybe it's fine for the 3.7.1, since it's newer? |
Fix a race condition in Popen of
multiprocessing.popen_spawn_win32. The child process now duplicates
rhandle instead of "stealing" it.
Previously, rhandle was "stolen" by the child process, but it leaked
a handle if the child process has been terminated before it "stole"
the handle (and closed it in the parent process).
https://bugs.python.org/issue33929