Summary
array_map(arr, fn) over a heap-allocating closure body (e.g. MkBox(@Int.0)) traps on GC shadow-stack overflow at around 4000 elements. Each iteration's gc_shadow_push consumes a slot on the 16 KiB / 4096-entry shadow stack and the per-iteration push is not unwound within the loop, so allocations across many iterations leak shadow-stack slots until the limit is hit.
Surfaced during work on #348 (GC worklist overflow); the natural runtime regression test for #348 (build a wide Array<ADT> and verify mark-phase correctness) hits this issue first.
Reproducer
private data Box { MkBox(Int) }
public fn main(@Unit -> @Int)
requires(true) ensures(true) effects(pure)
{
let @Array<Box> = array_map(
array_range(0, 5000),
fn(@Int -> @Box) effects(pure) { MkBox(@Int.0) }
);
array_fold(
@Array<Box>.0,
0,
fn(@Int, @Box -> @Int) effects(pure) {
@Int.0 + match @Box.0 { MkBox(@Int) -> @Int.0 }
}
)
}
$ vera run /tmp/repro.vera
Error: Reached `unreachable` WASM instruction
Source backtrace:
in anon_0 (line 7) ;; the inner MkBox closure
in main
Threshold (verified empirically): N=1000 ✓, N=3000 ✓, N=5000 ✗.
Root cause
The lifted closure body for MkBox(@Int.0):
- Saves
$gc_sp to gc_sp_save (per-call prologue)
- Allocates the Box via
$alloc; the new ptr is gc_shadow_push-ed
- Returns; the epilogue restores
$gc_sp from gc_sp_save, unwinding the push
That per-call balance is correct — the bug is upstream. array_map's loop in the host runtime calls the lifted closure once per element, and the outer lifted-function frame for array_map's body keeps each newly-allocated Box rooted on the shadow stack to survive subsequent $alloc calls in the same iteration. Those roots stack up across iterations because array_map doesn't have a "release per-iteration roots after the element is stored into the destination array" hook.
So the problem isn't that any single function leaks — it's that array_map's iterative builder keeps each per-element root alive until the whole loop completes, and 4096 such pushes exhaust the shadow stack.
Possible fixes
- Per-iteration unwind in
array_map codegen. After the destination element is stored, restore $gc_sp to a per-iteration save point. Mirrors the function-call epilogue pattern but at loop scope.
- Larger shadow stack. The stack is currently 16 KiB / 4096 entries. Doubling/quadrupling it would push the threshold higher but doesn't eliminate the hole.
- Eliminate the per-element shadow root entirely by writing the new element directly into the destination array slot (the array itself is rooted) and not pushing the transient.
Option 3 is the cleanest if feasible — it's a codegen change in array_map (and likely sibling builders like array_filter, array_mapi, array_flatten that follow the same pattern).
Affects
array_map and likely other iterative builders (array_filter, array_mapi, array_flatten, array_fold with allocating accumulators).
- Programs that build large arrays of heap-allocated values via these built-ins.
- Surfaces as a generic "Reached
unreachable WASM instruction" trap at runtime; the source backtrace points at the inner closure body, not at the GC stack-overflow check, so the failure mode is somewhat opaque to users.
Workaround
Pre-compute the size cap and split into chunks of ≤4000 via manual recursion, OR avoid heap-allocated element types in large arrays.
Related
- Surfaced while working on #348 (GC worklist overflow).
- Same shadow-stack subsystem as #347 (opaque-handle parameters incorrectly pushed) but a different symptom.
Summary
array_map(arr, fn)over a heap-allocating closure body (e.g.MkBox(@Int.0)) traps on GC shadow-stack overflow at around 4000 elements. Each iteration'sgc_shadow_pushconsumes a slot on the 16 KiB / 4096-entry shadow stack and the per-iteration push is not unwound within the loop, so allocations across many iterations leak shadow-stack slots until the limit is hit.Surfaced during work on #348 (GC worklist overflow); the natural runtime regression test for #348 (build a wide
Array<ADT>and verify mark-phase correctness) hits this issue first.Reproducer
Threshold (verified empirically): N=1000 ✓, N=3000 ✓, N=5000 ✗.
Root cause
The lifted closure body for
MkBox(@Int.0):$gc_sptogc_sp_save(per-call prologue)$alloc; the new ptr isgc_shadow_push-ed$gc_spfromgc_sp_save, unwinding the pushThat per-call balance is correct — the bug is upstream.
array_map's loop in the host runtime calls the lifted closure once per element, and the outer lifted-function frame forarray_map's body keeps each newly-allocated Box rooted on the shadow stack to survive subsequent$alloccalls in the same iteration. Those roots stack up across iterations becausearray_mapdoesn't have a "release per-iteration roots after the element is stored into the destination array" hook.So the problem isn't that any single function leaks — it's that
array_map's iterative builder keeps each per-element root alive until the whole loop completes, and 4096 such pushes exhaust the shadow stack.Possible fixes
array_mapcodegen. After the destination element is stored, restore$gc_spto a per-iteration save point. Mirrors the function-call epilogue pattern but at loop scope.Option 3 is the cleanest if feasible — it's a codegen change in
array_map(and likely sibling builders likearray_filter,array_mapi,array_flattenthat follow the same pattern).Affects
array_mapand likely other iterative builders (array_filter,array_mapi,array_flatten,array_foldwith allocating accumulators).unreachableWASM instruction" trap at runtime; the source backtrace points at the inner closure body, not at the GC stack-overflow check, so the failure mode is somewhat opaque to users.Workaround
Pre-compute the size cap and split into chunks of ≤4000 via manual recursion, OR avoid heap-allocated element types in large arrays.
Related