Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PyEval_GetFrame crashes when it returns incomplete frames #96975

Open
pablogsal opened this issue Sep 20, 2022 · 18 comments
Open

PyEval_GetFrame crashes when it returns incomplete frames #96975

pablogsal opened this issue Sep 20, 2022 · 18 comments

Comments

@pablogsal
Copy link
Member

pablogsal commented Sep 20, 2022

Is it possible that PyEval_GetFrame returns incomplete frames. When this happens, Python segfaults in release mode or crashes in debug mode because of the check for incomplete frames in _PyFrame_GetFrameObject. This is reproducible in many ways involving C extensions, the easier one may be by installing a custom memory allocator for all the domains and calling PyEval_GetFrame from there. As the allocator can be called at random points of the eval loop (for example here) is perfectly possible that a legitimate call to PyEval_GetFrame returns an incomplete frame.

@pablogsal
Copy link
Member Author

pablogsal commented Sep 20, 2022

@brandtbucher
Copy link
Member

brandtbucher commented Sep 20, 2022

Typo? Are you asking us if it's possible, or telling us it is?

@pablogsal
Copy link
Member Author

pablogsal commented Sep 20, 2022

I am telling you

@pablogsal pablogsal changed the title PyEval_GetFrame crashes if it returns incomplete frames PyEval_GetFrame crashes when it returns incomplete frames Sep 20, 2022
@pablogsal
Copy link
Member Author

pablogsal commented Sep 20, 2022

It may be easier to reproduce by using generators.

@brandtbucher
Copy link
Member

brandtbucher commented Sep 20, 2022

It seems to me that the correct thing to do in this case is to walk back over the frames until we reach a "complete" one, like we do in other places. Not 100% sure yet.

@brandtbucher
Copy link
Member

brandtbucher commented Sep 20, 2022

It may also be possible in sys._getframe(). The logic for skipping incomplete frames doesn't seem to happen anymore once you've walked back to the desired depth.

@brandtbucher
Copy link
Member

brandtbucher commented Sep 20, 2022

Ah wait, I think sys._getframe() is okay actually. I don't think it can be called from an incomplete frame.

@godlygeek
Copy link
Contributor

godlygeek commented Sep 21, 2022

It is possible to trigger this assertion using the C API.

In native_ext.c :
#define PY_SSIZE_T_CLEAN
#include <Python.h>

static PyMemAllocatorEx orig_allocator;
static _Thread_local int reentrant;

static void* malloc_hook(void *ctx, size_t size) {
    if (!reentrant) {
        reentrant = 1;
        PyEval_GetFrame();
        reentrant = 0;
    }
    return orig_allocator.malloc(ctx, size);
}

static PyMethodDef methods[] = {
        {NULL, NULL, 0, NULL},
};

static struct PyModuleDef moduledef = {PyModuleDef_HEAD_INIT, "native_ext", "", -1, methods};

PyMODINIT_FUNC
PyInit_native_ext(void)
{
    PyMem_GetAllocator(PYMEM_DOMAIN_OBJ, &orig_allocator);
    PyMemAllocatorEx alloc = orig_allocator;
    alloc.malloc = &malloc_hook;
    PyMem_SetAllocator(PYMEM_DOMAIN_OBJ, &alloc);
    return PyModule_Create(&moduledef);
}
In setup.py :
from distutils.core import Extension
from distutils.core import setup

setup(
    name="native_ext",
    version="0.0",
    ext_modules=[
        Extension(
            "native_ext",
            language="c",
            sources=["native_ext.c"],
            extra_compile_args=["-O0", "-g3"],
        ),
    ],
    zip_safe=False,
)

Installing the module into a python3.11-dbg venv and running the interpreter gives:

$ (.venv) 311_pyeval_getfram..>python3.11-dbg -c "import native_ext, json"
python3.11-dbg: /tmp/python3.11-3.11.0~rc2-0/Include/internal/pycore_frame.h:163: _PyFrame_GetFrameObject: Assertion `!_PyFrame_IsIncomplete(frame)' failed.

@brandtbucher
Copy link
Member

brandtbucher commented Sep 21, 2022

Thanks. I'll have to pick this back up tomorrow morning (might find some time tonight).

@ericsnowcurrently
Copy link
Member

ericsnowcurrently commented Sep 21, 2022

It is possible to trigger this assertion using the C API.

Thanks for the reproducer, which concisely demonstrates the bug. That's super helpful. There's a possible caveat, but only relating to the ways (and likelihood) the bug could be triggered in the wild.

FWIW, my understanding is that the CPython runtime expects the allocator to stay the same from initialization to finalization, so PyMem_SetAllocator() should be used only by embedders (before Py_Initialize()), not in extension modules. I may be wrong though. The docs don't mention this, nor (obviously) does the function check Py_IsInitialized() or _Py_IsCoreInitialized().

@vstinner would probably know for sure.

UPDATE: There is no caveat. Extensions can use PyMem_SetAllocator() as long as they only wrap the original allocator. See #96975 (comment).

@ericsnowcurrently
Copy link
Member

ericsnowcurrently commented Sep 21, 2022

Looking at the code, shouldn't we just call PyThreadState_GetFrame() in PyEval_GetFrame()?

@ericsnowcurrently
Copy link
Member

ericsnowcurrently commented Sep 21, 2022

FYI, this was fixed for signal handlers a week ago in #96776.

The assertion was added by Mark in July in #94371.

All the other places that call _PyFrame_GetFrameObject() (where the assert actually lives) seem to do the right thing. It looks like PyEval_GetFrame() just got missed (like happened with the signal module).

@vstinner
Copy link
Member

vstinner commented Sep 21, 2022

Python/initconfig.c calls _PyMem_SetDefaultAllocator() to force the usage of a specific memory allocator in function called before Python initialization (alloc) and during Python finalization (free): pymain_free() of Modules/main.c, called by Py_RunMain().

The Python memory allocator must be configured by the Python pre-initialization, or just after the pre-initialization, and before the Python initialization.

Setting a hook on existing allocators, like what tracemalloc does, is different and safe while Python is running.

@vstinner
Copy link
Member

vstinner commented Sep 21, 2022

the CPython runtime expects the allocator to stay the same from initialization to finalization

sorry, short reply: yes :-)

@ericsnowcurrently
Copy link
Member

ericsnowcurrently commented Sep 21, 2022

All the other places that call _PyFrame_GetFrameObject() (where the assert actually lives) seem to do the right thing.

I didn't look very closely at call_trace() or maybe_call_line_trace() (in ceval.c) though.

@ericsnowcurrently
Copy link
Member

ericsnowcurrently commented Sep 21, 2022

Setting a hook on existing allocators, like what tracemalloc does, is different and safe while Python is running.

The reproducer from #96975 (comment) does exactly that (simply wrapping the already-set allocator) so it's a legitimate example.

@brandtbucher
Copy link
Member

brandtbucher commented Sep 21, 2022

I didn't look very closely at call_trace() or maybe_call_line_trace() (in ceval.c) though.

They look safe, since tracing doesn't happen if we're in an incomplete state. The definition of _PyFrame_IsIncomplete is basically "can tracing happen yet?".

@kumaraditya303
Copy link
Contributor

kumaraditya303 commented Sep 21, 2022

Looking at the code, shouldn't we just call PyThreadState_GetFrame() in PyEval_GetFrame()?

That indeed fixes the issue and avoids code duplication.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Todo
Development

No branches or pull requests

7 participants