-
-
Notifications
You must be signed in to change notification settings - Fork 33.9k
[WIP] bpo-36142: Rework Python init to add _PyPreConfig #12087
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@ncoghlan @ericsnowcurrently @methane @zooba: Even if this PR doesn't fix all problems, it should make the Python initialization a little bit better :-) What do you think of the overall design? About the current PR, I'm not sure that it's a good idea to push such giant commit at all. I can split it into several changes, but it may be even more verbose since each change has to modify multiple files. |
I added a _PyCoreConfig structure to Python 3.7 which contains almost
all parameters used to configure Python. Problems: _PyCoreConfig uses
bytes and Unicode strings (char* and wchar_t*) whereas it is also
used to setup the memory allocator and (filesystem, locale and stdio)
encodings.
I propose to add a new _PyPreConfig which is the "strict minimum"
configuration to setup encodings and the memory allocator. In
practice, it also contains parameters which directly or indirectly
impacts the allocator and encodings. For example, isolated impacts
use_environment which impacts the allocator (PYTHONMALLOC environment
variable). Another example: dev_mode=1 sets the allocator to "debug".
The command line arguments are now parsed twice. _PyPreConfig only
parses a few parameters like -E, -I and -X. A temporary _PyPreCmdline
is used to store command line arguments like -X options.
I moved structures closer to where they are used. "Global" _PyMain
structure has been removed. _PyCmdline now lives way shorter than
previously and is moved from main.c to coreconfig.c. The idea is to
better control when and how memory is allocated.
In term of API, we get something like:
_PyCoreConfig config = _PyCoreConfig_INIT;
config.preconfig.stdio_encoding = "iso8859-1";
config.preconfig.stdio_errors = "replace";
config.user_site_directory = 0;
...
_PyInitError err = _Py_InitializeFromConfig(&config);
if (_Py_INIT_FAILED(err)) {
_Py_ExitInitError(err);
}
...
Py_Finalize();
return 0;
"config.preconfig.stdio_errors" syntax isn't great, but it's simpler
to implement than duplicating all _PyPreConfig fields into
_PyCoreConfig.
Changes:
* Add Python/preconfig.c file
* Rename Include/coreconfig.h to Include/cpython/coreconfig.h.
* Add Include/internal/pycore_coreconfig.h
* Add _PyPreConfig structure: pre-initialization configuration for
allocator and encodings.
* _PyCoreConfig now has a "_PyPreConfig preconfig;" field
* Add _PyArgv structure: helper to pass command line argument in
bytes (char*) or Unicode (wchar_t*)
* Add exitcode to _PyInitError
* Add 'exitcode' field to _PyInitError structure
* Add _Py_INIT_EXIT() macro
* Rename _Py_FatalInitError() to _Py_ExitInitError()
* Add new fields to _PyCoreConfig:
* run_command
* run_filename
* run_module
* skip_source_first_line
* Add _PyMain_Usage()
* _Py_InitializeCore() no longer changes the memory allocator.
* Hardcode short and long options in _PyOS_GetOpt().
* Make _PyCoreConfig_xxx() functions internal (Py_BUILD_CORE).
* Move _PyCmdline from main.c to coreconfig.c
* main.c: Remove _PyMain structure
|
I continued to work in the PR after I posted it to finish to cleanup the code. I succeeded to fix how and when the memory allocator is set. The drawback is that when Py_Initialize() is called before Py_Main(), PYTHONMALLOC is now ignored:
Technically, everything would be possible, like supporting this corner case. But it would make the code way more complex. It requires to change temporarily the memory allocator each time we allocate memory, to later be able to use the same allocator to release memory. Use such pattern: That's what I did in Python 3.7, but it's really ugly and risky. If any function isn't protected properly with this pattern, we use the wrong allocator and everything goes wrong (Python will likely crash). |
|
I'm now confident that a new "pre-initialization" step is required before _PyCoreConfig initialization (Py_Initialize). It makes the code more reliable and safer. |
|
This PR is just too big to be merged at once. Since I now found what I would like to do, I will split this PR into multiple smaller changes. |
Rename and move it to preconfig.c.
|
I started to rewrite this giant PR from scratch as simpler changes which are way easier to follow and understand. I close this PR. |
I added a _PyCoreConfig structure to Python 3.7 which contains almost
all parameters used to configure Python. Problems: _PyCoreConfig uses
bytes and Unicode strings (char* and wchar_t*) whereas it is also
used to setup the memory allocator and (filesystem, locale and stdio)
encodings.
I propose to add a new _PyPreConfig which is the "strict minimum"
configuration to setup encodings and the memory allocator. In
practice, it also contains parameters which directly or indirectly
impacts the allocator and encodings. For example, isolated impacts
use_environment which impacts the allocator (PYTHONMALLOC environment
variable). Another example: dev_mode=1 sets the allocator to "debug".
The command line arguments are now parsed twice. _PyPreConfig only
parses a few parameters like -E, -I and -X. A temporary _PyPreCmdline
is used to store command line arguments like -X options.
I moved structures closer to where they are used. "Global" _PyMain
structure has been removed. _PyCmdline now lives way shorter than
previously and is moved from main.c to coreconfig.c. The idea is to
better control when and how memory is allocated.
In term of API, we get something like:
"config.preconfig.stdio_errors" syntax isn't great, but it's simpler
to implement than duplicating all _PyPreConfig fields into
_PyCoreConfig.
Changes:
Add Python/preconfig.c file
Add _PyPreConfig structure: pre-initialization configuration for
allocator and encodings.
_PyCoreConfig now has a "_PyPreConfig preconfig;" field
Add _PyArgv structure: helper to pass command line argument in
bytes (char*) or Unicode (wchar_t*)
Add exitcode to _PyInitError
Add new fields to _PyCoreConfig:
Add _PyMain_Usage()
Hardcode short and long options in _PyOS_GetOpt().
Make _PyCoreConfig_xxx() functions internal (Py_BUILD_CORE).
Move _PyCmdline from main.c to coreconfig.c
main.c: Remove _PyMain structure
https://bugs.python.org/issue36142