classification
Title: test.support has way too many imports
Type: Stage: patch review
Components: Tests Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: miss-islington, serhiy.storchaka, shihai1991, vinay.sajip, vstinner
Priority: normal Keywords: patch

Created on 2020-04-13 22:24 by vstinner, last changed 2020-05-20 16:25 by shihai1991.

Pull Requests
URL Status Linked Edit
PR 19592 closed shihai1991, 2020-04-19 08:21
PR 19599 closed shihai1991, 2020-04-19 10:41
PR 19600 merged serhiy.storchaka, 2020-04-19 11:20
PR 19601 merged serhiy.storchaka, 2020-04-19 11:23
PR 19603 merged serhiy.storchaka, 2020-04-19 13:35
PR 19711 merged serhiy.storchaka, 2020-04-25 09:18
PR 19716 merged shihai1991, 2020-04-26 17:24
PR 19761 merged shihai1991, 2020-04-29 01:04
PR 19825 merged vstinner, 2020-05-01 00:08
PR 19905 merged shihai1991, 2020-05-04 16:41
PR 20128 merged shihai1991, 2020-05-16 09:42
PR 20131 merged shihai1991, 2020-05-16 17:24
PR 20207 open shihai1991, 2020-05-19 05:18
PR 20263 open shihai1991, 2020-05-20 16:25
Messages (29)
msg366337 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-13 22:24
test_threading.ThreadJoinOnShutdown.test_reinit_tls_after_fork() does crash on AIX: bpo-40068. One of the issue is that logging does crash at fork: bpo-40091. The strange thing is that test_threading doesn't use logging. I'm quite sure that logging is imported through test.support.

"import test.support" imports not less than 171... That's quite "heavy". It includes "heavy" modules like concurrent.futures, asyncio or multiprocessing.

It's maybe time to slice again test.support until submodules to reduce the number of imports to the bare minimum. For example, "import test.support" imports asyncio, whereas it's only used by very few tests. Maybe we should add "test.support.asyncioutils".

$ ./python
Python 3.9.0a5+ (heads/pycore_interp:a1ff2c5cf3, Apr 13 2020, 12:27:13) 
>>> import sys
>>> import test

>>> before=set(sys.modules)
>>> import test.support
>>> after=set(sys.modules)

>>> len(after - before)
171

>>> import pprint
>>> pprint.pprint(sorted(after - before))
['__mp_main__',
 '_asyncio',
 '_bisect',
 '_blake2',
 '_bz2',
 '_collections',
 '_compat_pickle',
 '_compression',
 '_contextvars',
 '_datetime',
 '_elementtree',
 '_functools',
 '_hashlib',
 '_heapq',
 '_lzma',
 '_opcode',
 '_operator',
 '_pickle',
 '_posixsubprocess',
 '_queue',
 '_random',
 '_sha3',
 '_sha512',
 '_socket',
 '_sre',
 '_ssl',
 '_string',
 '_struct',
 '_sysconfigdata_d_linux_x86_64-linux-gnu',
 '_weakrefset',
 'argparse',
 'array',
 'asyncio',
 'asyncio.base_events',
 'asyncio.base_futures',
 'asyncio.base_subprocess',
 'asyncio.base_tasks',
 'asyncio.constants',
 'asyncio.coroutines',
 'asyncio.events',
 'asyncio.exceptions',
 'asyncio.format_helpers',
 'asyncio.futures',
 'asyncio.locks',
 'asyncio.log',
 'asyncio.protocols',
 'asyncio.queues',
 'asyncio.runners',
 'asyncio.selector_events',
 'asyncio.sslproto',
 'asyncio.staggered',
 'asyncio.streams',
 'asyncio.subprocess',
 'asyncio.tasks',
 'asyncio.transports',
 'asyncio.trsock',
 'asyncio.unix_events',
 'base64',
 'binascii',
 'bisect',
 'bz2',
 'collections',
 'collections.abc',
 'concurrent',
 'concurrent.futures',
 'concurrent.futures._base',
 'contextlib',
 'contextvars',
 'copy',
 'copyreg',
 'datetime',
 'difflib',
 'dis',
 'email',
 'email.base64mime',
 'email.charset',
 'email.encoders',
 'email.errors',
 'email.header',
 'email.quoprimime',
 'enum',
 'errno',
 'faulthandler',
 'fnmatch',
 'functools',
 'gc',
 'gettext',
 'glob',
 'grp',
 'gzip',
 'hashlib',
 'heapq',
 'importlib',
 'importlib._bootstrap',
 'importlib._bootstrap_external',
 'importlib.abc',
 'importlib.machinery',
 'importlib.util',
 'inspect',
 'itertools',
 'keyword',
 'linecache',
 'locale',
 'logging',
 'logging.handlers',
 'lzma',
 'math',
 'multiprocessing',
 'multiprocessing.context',
 'multiprocessing.process',
 'multiprocessing.reduction',
 'nntplib',
 'opcode',
 'operator',
 'pickle',
 'platform',
 'pprint',
 'pwd',
 'pyexpat',
 'pyexpat.errors',
 'pyexpat.model',
 'queue',
 'quopri',
 'random',
 're',
 'reprlib',
 'resource',
 'select',
 'selectors',
 'shutil',
 'signal',
 'socket',
 'sre_compile',
 'sre_constants',
 'sre_parse',
 'ssl',
 'string',
 'struct',
 'subprocess',
 'sysconfig',
 'tempfile',
 'test.support',
 'test.support.testresult',
 'threading',
 'token',
 'tokenize',
 'traceback',
 'types',
 'typing',
 'typing.io',
 'typing.re',
 'unittest',
 'unittest.async_case',
 'unittest.case',
 'unittest.loader',
 'unittest.main',
 'unittest.result',
 'unittest.runner',
 'unittest.signals',
 'unittest.suite',
 'unittest.util',
 'urllib',
 'urllib.error',
 'urllib.response',
 'warnings',
 'weakref',
 'xml',
 'xml.etree',
 'xml.etree.ElementPath',
 'xml.etree.ElementTree',
 'zlib']
msg366765 - (view) Author: hai shi (shihai1991) * Date: 2020-04-19 08:30
> "import test.support" imports not less than 171... That's quite "heavy".

If we split some "heavy" modules to xxxutils, what benefits  will we make in fact(more exact module importing behavior)?
msg366766 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-19 08:40
+1 for splitting test.support on several submodules!

Some imports can also be lazy.
msg366767 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-19 08:57
asyncio is now imported in unittest. Removing direct import from test.support does not help.
msg366768 - (view) Author: hai shi (shihai1991) * Date: 2020-04-19 09:12
> asyncio is now imported in unittest. Removing direct import from test.support does not help.

Oh, thanks, you are right. Looks like we need check which submodules should be splitted.
msg366776 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-19 11:13
logging is also imported in unittest and indirectly in asyncio.
msg366780 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-19 14:01
These three PRs combined reduce the number of imported modules from 179 to 105 (not including 24 modules imported by the bare interpreter) and reduce the hot import time from 160 ms to 80 ms.
msg366786 - (view) Author: hai shi (shihai1991) * Date: 2020-04-19 15:40
> These three PRs combined reduce the number of imported modules from 179 to 105 (not including 24 modules imported by the bare interpreter) and reduce the hot import time from 160 ms to 80 ms.

Good job, serhiy. I tested your patches in my vm, the number of importing module have been reduced. Could you paste your performance benchmark test? I got a no siginificant change :( when I run: 
$ ./python -m pyperf timeit --compare-to python3.9 "from test import support" -p 100
python3.9: ..................................................................................................... 820 ns +- 206 ns

python: ..................................................................................................... 809 ns +- 201 ns

Mean +- std dev: [python3.9] 820 ns +- 206 ns -> [python] 809 ns +- 201 ns: 1.01x faster (-1%)
Not significant!
msg366787 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-19 15:49
I used -X importtime.

Your benchmark measures the performance of look up in sys.modules.
msg366789 - (view) Author: hai shi (shihai1991) * Date: 2020-04-19 16:25
> I used -X importtime.
copy that, thanks.

In master branch:
import time:      5011 |     133562 | test.support

After apply serhiy's patches:
import time:      4863 |      66940 | test.support
msg366810 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2020-04-20 05:44
What is the practical impact on the time taken for test runs? How much does the total time for a test run reduce as a result of refactoring test.support to minimise imports?
msg366815 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-20 09:24
Reducing the import time is not a goal, it is just a measurable side effect. The goal is to reduce the number of imported modules unneeded for the particular test. Importing every module can have side effects which affects the tested behavior. It would be nice to minimize it.
msg366823 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-20 13:06
Vinay Sajip: "What is the practical impact on the time taken for test runs?"

My first concern is that our "unit tests" are not "unit" anymore. test_threading should be restricted to test the threading module: it should not test "indirectly" the logging module.

If someone wants to test how logging behaves on fork, test_logging should get a new test. (I didn't check if it already has such test.)
msg367261 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-25 07:04
New changeset 3c8a5b459d68b4337776ada1e04f5b33f90a2275 by Serhiy Storchaka in branch 'master':
bpo-40275: Avoid importing asyncio in test.support (GH-19600)
https://github.com/python/cpython/commit/3c8a5b459d68b4337776ada1e04f5b33f90a2275
msg367262 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-25 07:06
New changeset 16994912c93e8e5db7365d48b75d67d3f70dd7b2 by Serhiy Storchaka in branch 'master':
bpo-40275: Avoid importing socket in test.support (GH-19603)
https://github.com/python/cpython/commit/16994912c93e8e5db7365d48b75d67d3f70dd7b2
msg367265 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-25 08:35
New changeset 515fce4fc4bb0d2db97b17df275cf90640017f56 by Serhiy Storchaka in branch 'master':
bpo-40275: Avoid importing logging in test.support (GH-19601)
https://github.com/python/cpython/commit/515fce4fc4bb0d2db97b17df275cf90640017f56
msg367603 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-29 01:11
New changeset 66abe98a816de84f89e2de4aa78cf09056227c25 by Hai Shi in branch 'master':
bpo-40275: Move requires_hashdigest() to test.support.hashlib_helper (GH-19716)
https://github.com/python/cpython/commit/66abe98a816de84f89e2de4aa78cf09056227c25
msg367629 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-29 07:36
New changeset bfb1cf44658934cbcd9707fb717d6770c78fbeb3 by Serhiy Storchaka in branch 'master':
bpo-40275: Move transient_internet from test.support to socket_helper (GH-19711)
https://github.com/python/cpython/commit/bfb1cf44658934cbcd9707fb717d6770c78fbeb3
msg367818 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-05-01 00:35
New changeset 2935e65c36fab1989bbda19db72c035ea22b044b by Victor Stinner in branch 'master':
bpo-40275: Fix name error in support.socket_helper (GH-19825)
https://github.com/python/cpython/commit/2935e65c36fab1989bbda19db72c035ea22b044b
msg368073 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-05-04 18:05
New changeset 975408c065b645e7d717546b0d744415abb45cd1 by Hai Shi in branch 'master':
bpo-40275: test.support imports lazily locale import (GH-19761)
https://github.com/python/cpython/commit/975408c065b645e7d717546b0d744415abb45cd1
msg368814 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-05-14 01:22
New changeset 7443d42021d433da0497f8ba651daa47e7dc1991 by Hai Shi in branch 'master':
bpo-40275: Import locale module lazily in gettext (GH-19905)
https://github.com/python/cpython/commit/7443d42021d433da0497f8ba651daa47e7dc1991
msg368816 - (view) Author: hai shi (shihai1991) * Date: 2020-05-14 04:37
Hi, folks.
If there have no other imports should be improved, I suggest to close this bpo ;)
msg368840 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-05-14 14:16
While the number of imports reduced a lot, "import test.support" still imports 98 modules. That's a lot! IMO we can reduce it a little bit more :-)

Examples of imports which could be lazy:

* faulthandler: move it into start_threads() and disable_faulthandler()
* shutil: move it into rmtree()
* bz2: move it into @requires_bz2 decorator. Is it possible? Same for zlib, gzip and lzma?


There are also tons of functions, it may be time to create even more submodules:

* support.threading_helper:

  * threading_setup()/threading_cleanup()
  * reap_threads()
  * wait_thread_exit()
  * join_thread()
  * catch_threading_exception context manager
  * start_threads()

* support.process_helper:
  
  * wait_process()
  * reap_children()
  * SuppressCrashReport
  * PIPE_MAX_SIZE (is it the best place for that one?)
  * args_from_interpreter_flags()
  * optim_args_from_interpreter_flags()
  * PythonSymlink

* support.file_helper: 

  * rmtree() and the related private functions
  * unlink()
  * rmdir()
  * FS_NONASCII
  * TESTFN_UNICODE, TESTFN_NONASCII, TESTFN_UNENCODABLE, TESTFN_UNDECODABLE
  * SAVEDCWD (not sure about this one)
  * TESTFN_ENCODING <= this one can be removed, it's just sys.getfilesystemencoding()
  * temp_dir()
  * change_cwd()
  * temp_cwd()
  * temp_umask()
  * create_empty_file()
  * make_bad_fd()
  * EnvironmentVarGuard (not sure about this one)
  * can_symlink(), skip_unless_symlink()
  * can_xattr(), skip_unless_xattr()
  * fs_is_case_insensitive()
  * fd_count()
  * FakePath

* support.import_helper:

  * import_module(), _ignore_deprecated_imports()
  * import_fresh_module(), _save_and_remove_module(), _save_and_block_module()
  * unload()
  * make_legacy_pyc()
  * forget()
  * CleanImport
  * DirsOnSysPath
  * modules_setup(), modules_cleanup()

* support.warnings_helper:

  * check_syntax_warning()
  * ignore_warnings()
  * WarningsRecorder
  * check_warnings(), _filterwarnings()
  * check_no_warnings()
  * check_no_resource_warning()

Move also move the following functions to socket_helper:

  * SOCK_MAX_SIZE
  * open_urlresource()
  * TransientResource
  * time_out
  * socket_peer_reset
  * ioerror_peer_reset

Serhiy: What do you think of these proposed submodules?
msg369029 - (view) Author: hai shi (shihai1991) * Date: 2020-05-16 09:50
OK, I continue to reduce the import module in test.supports.
msg369030 - (view) Author: hai shi (shihai1991) * Date: 2020-05-16 09:53
After PR20128, the import modules from 132 to 123.
msg369032 - (view) Author: miss-islington (miss-islington) Date: 2020-05-16 10:01
New changeset 372fa3ead584876a975a61936b376259be636d27 by Hai Shi in branch 'master':
bpo-40275: lazy import modules in test.support (GH-20128)
https://github.com/python/cpython/commit/372fa3ead584876a975a61936b376259be636d27
msg369062 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-05-16 19:29
I think that there is no much benefit in avoiding to import modules which are imported in libregrtest. In particular threading, subprocess, tempdir, os, fnmatch, etc. You should minimize import of test.libregrtest + test.support, not just test.support. BTW, libregrtests imports now much more modules than test.support.

Also, some modules, like bz2 are too small and do not have much dependencies. You will not save much on importing them lazily. On other hand, lazy import have its cost, so the real benefit will be even smaller.
msg369214 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-05-18 13:22
Serhiy:
> I think that there is no much benefit in avoiding to import modules which are imported in libregrtest.

Well, we should also enhance libregrtest in this case :-)

> You should minimize import of test.libregrtest + test.support, not just test.support.

Good idea.

> Also, some modules, like bz2 are too small and do not have much dependencies. You will not save much on importing them lazily. On other hand, lazy import have its cost, so the real benefit will be even smaller.

FYI bz2 and lzma can me some headaches when experimenting isolated subinterpreters while I was testing modules which don't use them. bz2 and lzma are not compatible with subintepreters and so caused crashes.

IMO in an ideal world, running test_xxx should start with an empty sys.modules and only imports what it needs. The problem are also side effects, not only memory footprint.

I created this issue because test_threading crashed on AIX because of a bug at fork in the logging module. Except that test_threading and threading modules *don't* importing logging. It's non obvious to get a crash in logging while testing the threading module. For me, these are tests are no longer "unit" tests, but more "integration" tests if we test indirectly "half of the stdlib" (I exaggerate on purpose ;-)).
msg369296 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-05-18 22:03
New changeset a3ec3ad9e20e7d9ed148d4cfbd22aebec608b42a by Hai Shi in branch 'master':
bpo-40275: More lazy imports in test.support (GH-20131)
https://github.com/python/cpython/commit/a3ec3ad9e20e7d9ed148d4cfbd22aebec608b42a
History
Date User Action Args
2020-05-20 16:25:05shihai1991setpull_requests: + pull_request19549
2020-05-19 05:18:48shihai1991setpull_requests: + pull_request19504
2020-05-18 22:03:04vstinnersetmessages: + msg369296
2020-05-18 13:22:08vstinnersetmessages: + msg369214
2020-05-16 19:29:56serhiy.storchakasetmessages: + msg369062
2020-05-16 17:24:11shihai1991setpull_requests: + pull_request19436
2020-05-16 10:01:46miss-islingtonsetnosy: + miss-islington
messages: + msg369032
2020-05-16 09:53:58shihai1991setmessages: + msg369030
2020-05-16 09:50:03shihai1991setmessages: + msg369029
2020-05-16 09:42:39shihai1991setpull_requests: + pull_request19433
2020-05-14 14:16:05vstinnersetmessages: + msg368840
2020-05-14 04:37:26shihai1991setmessages: + msg368816
2020-05-14 01:22:37vstinnersetmessages: + msg368814
2020-05-04 18:05:09vstinnersetmessages: + msg368073
2020-05-04 16:41:16shihai1991setpull_requests: + pull_request19220
2020-05-01 00:35:28vstinnersetmessages: + msg367818
2020-05-01 00:08:17vstinnersetpull_requests: + pull_request19146
2020-04-29 07:36:26serhiy.storchakasetmessages: + msg367629
2020-04-29 01:11:36vstinnersetmessages: + msg367603
2020-04-29 01:04:43shihai1991setpull_requests: + pull_request19102
2020-04-26 17:24:56shihai1991setpull_requests: + pull_request19041
2020-04-25 09:18:39serhiy.storchakasetpull_requests: + pull_request19033
2020-04-25 08:35:26serhiy.storchakasetmessages: + msg367265
2020-04-25 07:06:33serhiy.storchakasetmessages: + msg367262
2020-04-25 07:04:14serhiy.storchakasetmessages: + msg367261
2020-04-20 13:06:16vstinnersetmessages: + msg366823
2020-04-20 09:24:00serhiy.storchakasetmessages: + msg366815
2020-04-20 05:44:23vinay.sajipsetnosy: + vinay.sajip
messages: + msg366810
2020-04-19 16:25:28shihai1991setmessages: + msg366789
2020-04-19 15:49:55serhiy.storchakasetmessages: + msg366787
2020-04-19 15:40:14shihai1991setmessages: + msg366786
2020-04-19 14:01:56serhiy.storchakasetmessages: + msg366780
2020-04-19 13:35:05serhiy.storchakasetpull_requests: + pull_request18939
2020-04-19 11:23:34serhiy.storchakasetpull_requests: + pull_request18937
2020-04-19 11:20:36serhiy.storchakasetpull_requests: + pull_request18936
2020-04-19 11:13:43serhiy.storchakasetmessages: + msg366776
2020-04-19 10:41:04shihai1991setpull_requests: + pull_request18935
2020-04-19 09:12:36shihai1991setmessages: + msg366768
2020-04-19 08:57:17serhiy.storchakasetmessages: + msg366767
2020-04-19 08:40:07serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg366766
2020-04-19 08:30:31shihai1991setmessages: + msg366765
2020-04-19 08:21:49shihai1991setkeywords: + patch
nosy: + shihai1991

pull_requests: + pull_request18929
stage: patch review
2020-04-13 22:24:51vstinnercreate