Skip to content

bpo-38762: Extend logging.test_multiprocessing to cover missing cases.#22142

Merged
vsajip merged 8 commits intopython:masterfrom
iritkatriel:logging-processName-pid
Sep 8, 2020
Merged

bpo-38762: Extend logging.test_multiprocessing to cover missing cases.#22142
vsajip merged 8 commits intopython:masterfrom
iritkatriel:logging-processName-pid

Conversation

@iritkatriel
Copy link
Copy Markdown
Member

@iritkatriel iritkatriel commented Sep 7, 2020

In PR 22063 we have a solution that always imports multiprocessing (unless the user explicitly sets logMultiprocessing to False). It has the disadvantage that multiprocessing is imported even if the application only ever has one process.

This PR has an alternative solution which calculates processName on a best-effort basis, so that if multiprocessing is not imported, processName is None. Note that in the case the process field of the record has the process ID (so long as logProcesses is True and os has a getpid() function).

So the change is:

  1. For single process applications, the processName is None instead of "MainProcess", which probably doesn't matter much because there is only one process.

  2. In the edge case where there are multiple processes but multiprocessing was somehow removed from sys.modules, the processName will be None instead of "MainProcess", which is better because the latter may be incorrect.

https://bugs.python.org/issue38762

…me by removing the flaky assumption that if multiprocessing is not in sys.modules then it must be the main process.
Comment thread Lib/logging/__init__.py Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this a behaviour change? As logMultiprocessing defaults to True, in the common case where one doesn't change it and doesn't use multiprocessing, processName would be 'MainProcess' in LogRecords, before this change. After the change, it will be None. What's the benefit of this?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right. Our choices currently are:

(1) leave it as it is, then sometimes (but not often) "MainProcess" is incorrect
(2) do this and then MainProcess is replaced by None, which is at least not incorrect
(3) always import multiprocessing and then process name is always correct

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • (3) is bad because we do an import which will not be necessary in the large majority of cases (ones that don't use multiprocessing). It's probably not cheap to import, though I haven't done any actual measurements (I try to err on the side of minimizing logging processing, because many people are sensitive to logging overhead and I don't feel this import would be justified for the vast majority of cases).
  • As for (1) and (2), let's look at when processName would be incorrect:
    1. In a process that isn't the main process, and
    2. Although it was invoked by multiprocessing (as not the main process), sys.modules would have contained at some point, but then somehow lost, the entry for multiprocessing. There might be reasons for this:
      • The user program deliberately does a del sys.modules['multiprocessing'], or does something else to sys.modules which causes that entry to be removed. This is a very uncommon use case (cited in the issue as a theoretical possibility, but not apparently supported by an actual real-world example). In this case, I think it is better for the developer of that program to handle that case explicitly, without forcing everyone to pay any re-import price in the normal use case where sys.modules doesn't lose stuff. This was my suggestion to the issue's OP.
      • The state of sys.modules changes during program shutdown. This does happen, but all sorts of things are winding down at that point and there is no point in trying to log stuff where even things like sys.stdout and sys.stderr might not be available. So I don't believe we have to handle this case, either.

In summary, I don't think anything should change in logging itself for this issue. However, that you added a test is valuable, and I thank for you that and the thought and time you have put into this! I think the best thing to do is close that other PR, revert the change in logging and keep the test, and as long as it passes, I'll approve and merge this.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes a lot of sense. I'll update this PR for the test and close the other one. I think we needed to look at all options to see that none of them are better than the current state.

@iritkatriel iritkatriel changed the title bpo-38762: Fix edge case where logging records an incorrect processNa… bpo-38762: Extend test_multiprocessing to cover missing cases. Sep 8, 2020
@iritkatriel iritkatriel force-pushed the logging-processName-pid branch from ffb7da5 to 8fc5a88 Compare September 8, 2020 15:45
@serhiy-storchaka serhiy-storchaka added 🔨 test-with-buildbots Test PR w/ buildbots; report in status section skip news labels Sep 8, 2020
@bedevere-bot
Copy link
Copy Markdown

🤖 New build scheduled with the buildbot fleet by @serhiy-storchaka for commit 192d6bc 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Sep 8, 2020
@serhiy-storchaka serhiy-storchaka added the tests Tests in the Lib/test dir label Sep 8, 2020
@iritkatriel iritkatriel changed the title bpo-38762: Extend test_multiprocessing to cover missing cases. bpo-38762: Extend logging.test_multiprocessing to cover missing cases. Sep 8, 2020
@vsajip vsajip merged commit 3fd6999 into python:master Sep 8, 2020
@iritkatriel iritkatriel deleted the logging-processName-pid branch November 2, 2020 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

skip news tests Tests in the Lib/test dir

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants