When a Vera program is interrupted (or possibly during normal execution — see updated analysis), the Python process aborts with a macOS malloc error inside wasmtime's host-function trampoline:
Python(NNN,0xN): malloc: *** error for object 0xN: pointer being freed was not allocated
Python(NNN,0xN): malloc: *** set a breakpoint in malloc_error_break to debug
Abort trap: 6
The user sees a macOS "Python quit unexpectedly" popup. The fix for the related Python KeyboardInterrupt traceback is shipped in v0.0.137 (host_sleep catches KeyboardInterrupt and raises _VeraExit(130) for clean exit). The malloc abort is a separate, lower-level issue that may persist even after the Python-traceback fix lands.
Updated diagnosis (with crash report stack trace)
The full macOS crash report points the abort at a very specific call site:
This rules out my earlier "cleanup-path" hypothesis: the abort happens inside wasmtime::runtime::func::HostFunc::array_call_trampoline (at offset +456), which is wasmtime's trampoline that wraps host imports. The trampoline:
Marshals call args from WASM ABI to Rust ABI
Invokes the host function (our Python callback via ctypes)
Marshals return values back / cleans up
The +456 offset places us AFTER the host callback returned (or threw), in the cleanup/return phase. Memory the trampoline allocated for the call is being freed, but the freed pointer wasn't malloc'd by the same allocator.
Combined with the 24 frames of unsymbolicated WASM code at the same offset (suggesting 24-deep recursion through run_loop), the crash signature is consistent with: the deep WASM recursion has corrupted some memory wasmtime depends on, and the corruption surfaces when the host trampoline tries to clean up after a host call.
Revised hypothesis (likely related to #593)
The previous hypothesis listed three possibilities. The stack trace narrows it:
wasmtime-py callback teardown ordering — likely NOT the cleanup ordering itself; the abort is mid-trampoline, not at process exit.
Outstanding shadow-stack root not cleared — possible but doesn't directly explain malloc/free mismatch.
Native callback re-entrancy — possible but the trace shows a single call stack, not signal-handler reentry.
NEW (most likely): heap corruption from an in-progress codegen bug. The same codegen path that produces the U+FFFD-string corruption documented in #593 is plausibly also corrupting wasmtime-internal heap structures (e.g. the WASM linear memory could be overflowing into wasmtime's own allocator state, or a misaligned write to linear memory could clobber a metadata header that wasmtime later tries to free).
This hypothesis is supported by:
Both symptoms (the U+FFFD string corruption from #593 and this abort) appear from "generation 1+" timing (deep into the recursive run_loop).
The 24-frame deep WASM stack at frames 7-27 corresponds to ~24 generations of run_loop recursion.
Possibly Python-3.14-related?
The user's Python is 3.14.3; the 3.14 series was first released in October 2025. Python 3.14 included significant ctypes refactoring, and wasmtime-py may not yet be hardened against the new ctypes ABI behaviour. Worth testing the same reproducer under Python 3.13 to see if the abort still fires; if it does not, this is at least partly a wasmtime-py-vs-Python-3.14 ABI gap.
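The 3.13-vs-3.14 comparison is only useful if each run records exactly which interpreter and binding it used. A small helper along these lines could be run alongside the reproducer in both environments. It is a sketch: the name `describe_environment` is made up for illustration, and it assumes the binding is installed under the distribution name `wasmtime`.

```python
import platform
import sys
from importlib import metadata


def describe_environment(dist_name="wasmtime"):
    """Collect the interpreter and binding versions relevant to this abort.

    dist_name is an assumption: the distribution name the wasmtime binding
    is installed under. If it is not installed, its version is None.
    """
    try:
        binding_version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        binding_version = None
    return {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "machine": platform.machine(),
        dist_name: binding_version,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Attaching this output to each crash report would make it clear whether a given abort came from the 3.13 or the 3.14 run.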
Reproducer
Run any Vera program that (a) recurses deeply with allocating arguments, (b) uses host imports (especially IO.sleep, IO.print), and (c) hits #593's heap-corruption trigger. The simplest:
vera run /Users/aa/Downloads/files/life_full_program.vera
# Wait through generations 0-50, then Ctrl-C OR let it run to completion
The malloc abort fires reliably once the Life program reaches the corruption window from #593.
DETERMINISTIC REPRODUCER (added 2026-05-07)
While testing the IO.sleep KeyboardInterrupt guard fix in PR #594, I temporarily reverted the guard and ran the e2e test to confirm it caught the regression. The test triggered an immediate SIGABRT matching this issue's stack trace exactly:
This is the SAME signature this issue documents — but reproducible via a 5-line setup, no Life program / 200 generations / manual Ctrl-C required.
Reproducer:
# Run with the production host_sleep guard REMOVED
# (the guard at vera/codegen/api.py around line 1191).
import time as _time
from unittest.mock import patch

from vera.codegen import compile as compile_program, execute
from vera.parser import parse_to_ast

source = '''
public fn main(@Unit -> @Unit)
  requires(true) ensures(true) effects(<IO>)
{
  IO.sleep(120)
}
'''

result = compile_program(parse_to_ast(source), source=source)
with patch.object(_time, "sleep", side_effect=KeyboardInterrupt):
    execute(result)  # <-- reliably aborts with the malloc trampoline crash
This narrows the hypothesis space dramatically:
It's NOT about deep recursion (the program above does ONE IO.sleep call).
It's NOT about heap corruption from any Vera codegen bug (the program is trivial; the corruption is in wasmtime / libmalloc).
It's NOT about scale (one host call, one Python exception).
It's NOT specific to actual SIGINT — any KeyboardInterrupt raised inside a host import triggers it.
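The bug is a pure interaction between:
HostFunc::array_call_trampoline (Rust)
KeyboardInterrupt (or any Python exception?) escaping the host callback unexpectedly
Implications:
Worth filing upstream against wasmtime-py as a minimal reproducer.
Worth testing the same reproducer under Python 3.13 to isolate any 3.14-specific ABI gap (Python 3.14's ctypes refactoring is the prime suspect).
This abort is fully synthetic and doesn't need any Life program at all.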
The PR #594 production guard (catching KeyboardInterrupt and converting to _VeraExit(130) before it can escape the host import) closes the user-visible half by ensuring the production path never reaches this trigger. The underlying wasmtime/Python bug remains.
Severity
Medium — but possibly an early surface of #593's underlying corruption. Severity escalates if the same heap corruption can be triggered by a Vera program without manual interruption.
Acceptance
Reproducer above runs to completion without malloc abort.
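Because a malloc abort kills the whole interpreter, an in-process test cannot observe it failing. One way to check this criterion is to run the reproducer in a child interpreter and inspect how it exited. The helper below is a sketch under that assumption: `run_isolated` and `aborted` are hypothetical names, and the `os.abort()` child is only a stand-in that simulates the SIGABRT. The real check would pass the reproducer source instead.

```python
import signal
import subprocess
import sys


def run_isolated(code, timeout=60):
    """Run `code` in a fresh interpreter and return its return code.

    On POSIX, subprocess reports death-by-signal as a negative return
    code, so a SIGABRT shows up as -signal.SIGABRT.
    """
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        timeout=timeout,
    )
    return completed.returncode


def aborted(returncode):
    """True if the child was killed by SIGABRT (POSIX convention)."""
    return returncode == -signal.SIGABRT


if __name__ == "__main__":
    # Stand-in child that aborts the same way the crash does.
    print("aborted:", aborted(run_isolated("import os; os.abort()")))
```

With the reproducer source passed in as `code`, the acceptance check becomes `not aborted(run_isolated(code))`, and the parent test process survives either outcome.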
Workaround
None known.
Related
host_sleep KeyboardInterrupt → _VeraExit(130) fix (Python-traceback half; lands separately in PR #594, "v0.0.137: fix #588 (captured-Array indexing in closure produces invalid WASM)").