Context
While reviewing #489 (iterative array_fold), CodeRabbit flagged that my ADT-rooting heuristic treats every i32 accumulator that isn't Bool/Byte as a GC-managed heap pointer:
    u_is_adt = (
        u_wasm == "i32"
        and u_type not in ("Bool", "Byte")
        and not u_is_pair
    )
    u_needs_root = u_is_pair or u_is_adt
This over-roots host-managed opaque handles: Map, Set, Regex, and Decimal handles are small host-side integers (see host_map_new in vera/codegen/api.py:1037, which returns _map_alloc({}), an index into a Python-side dict). They don't point into the Vera heap, so shadow-rooting them is wasted work.
Why not fixed in #489
Shadow-rooting a host handle is safe: the conservative GC just checks whether the value falls in [gc_heap_start + 4, heap_ptr) with alignment (val - heap_start) % 8 == 4. A small int like 5 or 17 is rejected as out-of-range and nothing spurious is marked. The worst case is a handle value that happens to land in heap range with matching alignment — a false-positive mark on some unrelated heap object, which retains memory spuriously but doesn't corrupt.
Under-rooting would be a real bug (silent GC of a live Vera ADT, same class as #464). The current heuristic errs on the safer side.
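To make the range-and-alignment argument concrete, here is a minimal sketch of the check as described above. The function name `looks_like_heap_pointer` and the example heap bounds are hypothetical, and it assumes that `heap_start` in the alignment expression is the same value as `gc_heap_start`. The real check lives in the GC and may differ in detail.

```python
def looks_like_heap_pointer(val: int, gc_heap_start: int, heap_ptr: int) -> bool:
    """Conservative test: could `val` be a pointer to a Vera heap object?

    Hypothetical helper mirroring the description in this issue, not the
    actual GC code.
    """
    # Candidate must fall in [gc_heap_start + 4, heap_ptr).
    in_range = gc_heap_start + 4 <= val < heap_ptr
    # Candidate must sit at offset 4 within an 8-byte-aligned slot.
    aligned = (val - gc_heap_start) % 8 == 4
    return in_range and aligned


# Example bounds are made up for illustration only.
GC_HEAP_START = 1024
HEAP_PTR = 4096

small_handles_rejected = not any(
    looks_like_heap_pointer(h, GC_HEAP_START, HEAP_PTR) for h in (5, 17)
)
```

With these bounds, handles such as 5 or 17 fail the range test, while a value like 1028 passes both tests. A host handle would only be mistaken for a heap pointer if it happened to land on such a value.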
Proposed fix
Add a type-category predicate that distinguishes Vera-heap types from host-managed handles:
    def is_gc_managed(vera_type: str) -> bool:
        """True when the type is represented as a pointer into the Vera GC heap."""
        # Pair types (String, Array<T>): ptr + len, ptr is GC-managed
        if _is_pair_element_type(vera_type):
            return True
        # Host-managed opaque handles: i32 index into host-side data
        if vera_type in {"Map", "Set", "Regex", "Decimal"}:
            return False
        # Primitives
        if vera_type in {"Int", "Nat", "Float64", "Bool", "Byte", "Unit"}:
            return False
        # Everything else (ADTs, Option, Result, Json, Html, user data types) is
        # Vera-heap allocated.
        return True
Use it in _translate_array_fold (and anywhere else that makes GC-rooting decisions by wasm type alone). Needs a test for each category: Int (skip), Map (skip), Option (root), user ADT (root), String (root as pair).
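The per-category tests could look roughly like the sketch below. It is self-contained, so it repeats the proposed predicate and substitutes a stand-in for `_is_pair_element_type`. The stand-in's behaviour, the `rooting_decision` helper, and the user ADT name `Shape` are all assumptions for illustration, not code from the repository.

```python
def _is_pair_element_type(vera_type: str) -> bool:
    # Stand-in (assumption): the real helper lives in the codegen and may
    # recognise pair types differently.
    return vera_type == "String" or vera_type.startswith("Array<")


def is_gc_managed(vera_type: str) -> bool:
    """True when the type is represented as a pointer into the Vera GC heap."""
    if _is_pair_element_type(vera_type):
        return True
    if vera_type in {"Map", "Set", "Regex", "Decimal"}:
        return False
    if vera_type in {"Int", "Nat", "Float64", "Bool", "Byte", "Unit"}:
        return False
    return True


def rooting_decision(u_type: str) -> tuple[bool, bool]:
    """Hypothetical call-site shape: returns (u_is_pair, u_needs_root)."""
    u_is_pair = _is_pair_element_type(u_type)
    u_needs_root = is_gc_managed(u_type)
    return u_is_pair, u_needs_root


# One representative per category named in the issue.
# "Shape" is a made-up user ADT name.
EXPECTED_ROOTING = {
    "Int": False,     # primitive: skip
    "Map": False,     # host handle: skip
    "Option": True,   # Vera-heap ADT: root
    "Shape": True,    # user ADT: root
    "String": True,   # pair: root via its pointer half
}


def test_rooting_categories() -> None:
    for vera_type, expected in EXPECTED_ROOTING.items():
        assert is_gc_managed(vera_type) is expected, vera_type
    # String should still be identified as a pair, not as a plain ADT.
    assert rooting_decision("String") == (True, True)
```

A table-driven test keeps the category list in one place, so adding a new host-managed handle type later means touching one line in the test and one in the predicate.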
Scope
Small, self-contained. Good issue for anyone coming fresh to the WASM codegen. Doesn't block any user-facing feature.