Over-rooting of host-managed handles in array_fold (and future iterative combinators) #490

@aallan

Context

While reviewing #489 (iterative array_fold), CodeRabbit flagged that my ADT-rooting heuristic treats every i32 accumulator that isn't Bool/Byte as a GC-managed heap pointer:

u_is_adt = (
    u_wasm == "i32"
    and u_type not in ("Bool", "Byte")
    and not u_is_pair
)
u_needs_root = u_is_pair or u_is_adt

This over-roots host-managed opaque handles: Map, Set, Regex, and Decimal handles are small host-side integers (see host_map_new in vera/codegen/api.py:1037, which returns _map_alloc({}), an index into a Python-side dict). They don't point into the Vera heap, so shadow-rooting them is wasted work.
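To make the over-rooting concrete, here is a minimal sketch that wraps the heuristic above in a standalone function. The name needs_root_current and the sample type/wasm pairings (for example, Int lowering to i64) are assumptions for illustration, not taken from the codegen.

```python
def needs_root_current(u_type: str, u_wasm: str, u_is_pair: bool) -> bool:
    """Hypothetical wrapper around the current rooting heuristic."""
    u_is_adt = (
        u_wasm == "i32"
        and u_type not in ("Bool", "Byte")
        and not u_is_pair
    )
    return u_is_pair or u_is_adt


# (type, wasm type, is_pair) triples; the lowerings are assumed.
samples = [
    ("Map", "i32", False),          # host handle: rooted, but need not be
    ("Bool", "i32", False),         # primitive, explicitly excluded
    ("Int", "i64", False),          # primitive, excluded by wasm type
    ("Option<Int>", "i32", False),  # heap ADT: rooted, correctly
    ("String", "i32", True),        # pair: rooted, correctly
]

for u_type, u_wasm, u_is_pair in samples:
    rooted = needs_root_current(u_type, u_wasm, u_is_pair)
    print(f"{u_type:12} root={rooted}")
```

Under these assumptions, Map is rooted exactly like Option<Int>, which is the over-rooting described above.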

Why not fixed in #489

Shadow-rooting a host handle is safe: the conservative GC just checks whether the value falls in [gc_heap_start + 4, heap_ptr) with alignment (val - heap_start) % 8 == 4. A small int like 5 or 17 is rejected as out-of-range and nothing spurious is marked. The worst case is a handle value that happens to land in heap range with matching alignment. That produces a false-positive mark on some unrelated heap object, which retains memory spuriously but doesn't corrupt the heap.
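The check described above can be sketched in a few lines of Python. This is an illustration only, not the actual GC code: the function name looks_like_heap_pointer is hypothetical, the heap bounds in the usage lines are made-up numbers, and it assumes that gc_heap_start and heap_start refer to the same base address.

```python
def looks_like_heap_pointer(val: int, gc_heap_start: int, heap_ptr: int) -> bool:
    """Conservative test: could val be a pointer to a live heap object?"""
    # Range check: [gc_heap_start + 4, heap_ptr)
    in_range = gc_heap_start + 4 <= val < heap_ptr
    # Alignment check, assuming heap_start == gc_heap_start
    aligned = (val - gc_heap_start) % 8 == 4
    return in_range and aligned


# Hypothetical heap bounds, chosen only for the example.
GC_HEAP_START = 1024
HEAP_PTR = 4096

# Small host handles fall below the heap and are rejected.
print(looks_like_heap_pointer(5, GC_HEAP_START, HEAP_PTR))
print(looks_like_heap_pointer(17, GC_HEAP_START, HEAP_PTR))
# A handle that lands in range with matching alignment is a false positive.
print(looks_like_heap_pointer(1028, GC_HEAP_START, HEAP_PTR))
```

With these bounds, 5 and 17 are rejected by the range check, while a handle value of 1028 would be accepted and would cause the spurious mark.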

Under-rooting would be a real bug (silent GC of a live Vera ADT, same class as #464). The current heuristic errs on the safer side.

Proposed fix

Add a type-category predicate that distinguishes Vera-heap types from host-managed handles:

def is_gc_managed(vera_type: str) -> bool:
    """True when the type is represented as a pointer into the Vera GC heap."""
    # Pair types (String, Array<T>): ptr + len, ptr is GC-managed
    if _is_pair_element_type(vera_type):
        return True
    # Host-managed opaque handles: i32 index into host-side data
    if vera_type in {"Map", "Set", "Regex", "Decimal"}:
        return False
    # Primitives
    if vera_type in {"Int", "Nat", "Float64", "Bool", "Byte", "Unit"}:
        return False
    # Everything else (ADTs, Option, Result, Json, Html, user data types) is
    # Vera-heap allocated.
    return True

Use it in _translate_array_fold (and anywhere else that makes GC-rooting decisions by wasm type alone). Needs a test for each category: Int (skip), Map (skip), Option (root), user ADT (root), String (root as pair).
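A rough sketch of the per-category test table follows. To keep it self-contained, it repeats the proposed predicate and uses a stand-in for _is_pair_element_type; that stand-in, the "Option<Int>" spelling, and the "Shape" user ADT are assumptions made for the example, and the real helper may behave differently.

```python
def _is_pair_element_type(vera_type: str) -> bool:
    """Stand-in for the real helper (assumed behaviour)."""
    return vera_type == "String" or vera_type.startswith("Array<")


def is_gc_managed(vera_type: str) -> bool:
    """Copy of the proposed predicate, for a self-contained example."""
    if _is_pair_element_type(vera_type):
        return True
    if vera_type in {"Map", "Set", "Regex", "Decimal"}:
        return False
    if vera_type in {"Int", "Nat", "Float64", "Bool", "Byte", "Unit"}:
        return False
    return True


# One case per category named above: (type, expected rooting decision).
CASES = [
    ("Int", False),         # primitive: skip
    ("Map", False),         # host handle: skip
    ("Option<Int>", True),  # built-in ADT: root
    ("Shape", True),        # hypothetical user ADT: root
    ("String", True),       # pair: root
]


def check_categories() -> None:
    for vera_type, expected in CASES:
        actual = is_gc_managed(vera_type)
        assert actual == expected, f"{vera_type}: got {actual}"


check_categories()
print("all categories classified as expected")
```

In the real suite this would probably be a parametrized test rather than a loop. One edge case to consider: if a handle type ever reaches the predicate with type parameters in its string form (a hypothetical "Map<String, Int>"), the exact-match set lookup falls through to the final return True, so that spelling may deserve its own test.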

Scope

Small, self-contained. Good issue for anyone coming fresh to the WASM codegen. Doesn't block any user-facing feature.

Metadata

Labels: bug (Something isn't working)
