bpo-38762: Extend logging.test_multiprocessing to cover missing cases.#22142
bpo-38762: Extend logging.test_multiprocessing to cover missing cases.#22142vsajip merged 8 commits intopython:masterfrom
Conversation
…me by removing the flaky assumption that if multiprocessing is not in sys.modules then it must be the main process.
There was a problem hiding this comment.
Isn't this a behaviour change? As logMultiprocessing defaults to True, in the common case where one doesn't change it and doesn't use multiprocessing, processName would be 'MainProcess' in LogRecords, before this change. After the change, it will be None. What's the benefit of this?
There was a problem hiding this comment.
That's right. Our choices currently are:
(1) leave it as it is, then sometimes (but not often) "MainProcess" is incorrect
(2) do this and then MainProcess is replaced by None, which is at least not incorrect
(3) always import multiprocessing and then process name is always correct
There was a problem hiding this comment.
- (3) is bad because we do an import which will not be necessary in the large majority of cases (ones that don't use
multiprocessing). It's probably not cheap to import, though I haven't done any actual measurements (I try to err on the side of minimizing logging processing, because many people are sensitive to logging overhead and I don't feel this import would be justified for the vast majority of cases). - As for (1) and (2), let's look at when
processNamewould be incorrect:- In a process that isn't the main process, and
- Although it was invoked by
multiprocessing(as not the main process),sys.moduleswould have contained at some point, but then somehow lost, the entry formultiprocessing. There might be reasons for this:- The user program deliberately does a
del sys.modules['multiprocessing'], or does something else tosys.moduleswhich causes that entry to be removed. This is a very uncommon use case (cited in the issue as a theoretical possibility, but not apparently supported by an actual real-world example). In this case, I think it is better for the developer of that program to handle that case explicitly, without forcing everyone to pay any re-import price in the normal use case wheresys.modulesdoesn't lose stuff. This was my suggestion to the issue's OP. - The state of
sys.moduleschanges during program shutdown. This does happen, but all sorts of things are winding down at that point and there is no point in trying to log stuff where even things likesys.stdoutandsys.stderrmight not be available. So I don't believe we have to handle this case, either.
- The user program deliberately does a
In summary, I don't think anything should change in logging itself for this issue. However, that you added a test is valuable, and I thank for you that and the thought and time you have put into this! I think the best thing to do is close that other PR, revert the change in logging and keep the test, and as long as it passes, I'll approve and merge this.
There was a problem hiding this comment.
That makes a lot of sense. I'll update this PR for the test and close the other one. I think we needed to look at all options to see that none of them are better than the current state.
ffb7da5 to
8fc5a88
Compare
…the logging.logMultiprocessing=False case, which is covered by the next test.
|
🤖 New build scheduled with the buildbot fleet by @serhiy-storchaka for commit 192d6bc 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
In PR 22063 we have a solution that always imports multiprocessing (unless the user explicitly sets logMultiprocessing to False). It has the disadvantage that multiprocessing is imported even if the application only ever has one process.
This PR has an alternative solution which calculates processName on a best-effort basis, so that if multiprocessing is not imported, processName is None. Note that in the case the process field of the record has the process ID (so long as logProcesses is True and os has a getpid() function).
So the change is:
For single process applications, the processName is None instead of "MainProcess", which probably doesn't matter much because there is only one process.
In the edge case where there are multiple processes but multiprocessing was somehow removed from sys.modules, the processName will be None instead of "MainProcess", which is better because the latter may be incorrect.
https://bugs.python.org/issue38762