Map<K, T_heap> values stored in _map_store are invisible to conservative GC scan — post-walk reachability of JObject children #695

@aallan

Summary

Map<K, T_heap> values (heap-allocated WASM blocks stored as Python ints in _map_store[handle]) are invisible to the conservative GC scan. A $gc_collect triggered after the map is constructed will reclaim those blocks, leaving map_get returning pointers to freed memory.

Surfaced as an outside-diff observation by CodeRabbit on PR #693 (the #692 host-walker GC rooting fix) — specifically pointing at the JObject branch of vera/wasm/json_serde.py::write_json. The concern is real but pre-dates #692 and applies to the broader Map<K, T_heap> contract, so deferring as a separate issue rather than expanding #693.

Root cause

_alloc_map_wrapper (vera/codegen/api.py) stores its argument dict in a Python-side _map_store[handle]. For Map<String, Json> or Map<String, HtmlNode> the values are i32 heap pointers — but they only exist in Python memory, not WASM linear memory.

The conservative scan in $gc_collect Phase 2a walks the WASM shadow stack and any reachable heap blocks looking for i32-shaped pointers in the heap range. It never enters _map_store. The wrapper ADT carries the raw handle ORed with 0x80000000 (per #578) at body+4, which is structurally outside the heap range and correctly skipped by the scan — but that means the handle is dead-end from a tracing perspective. Heap blocks pointed to only from _map_store[handle] are unmarked → sweep reclaims them.
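The failure mode can be reproduced in a toy model. The following is a minimal sketch, not Vera's actual collector: `ToyHeap`, `map_store`, and the block layout are hypothetical stand-ins, and only the 0x80000000 handle tag is taken from the description above.

```python
# Toy model only: none of these names are Vera's real internals.
HANDLE_TAG = 0x80000000  # high bit marks a host handle, per the issue text


class ToyHeap:
    """Stand-in for WASM linear memory: a table of blocks of i32 words."""

    def __init__(self):
        self.blocks = {}      # block pointer -> list of i32 words in the block
        self.next_ptr = 0x1000

    def alloc(self, words):
        ptr = self.next_ptr
        self.next_ptr += 0x10
        self.blocks[ptr] = list(words)
        return ptr

    def collect(self, shadow_stack):
        """Conservative mark from the shadow stack, then sweep."""
        marked = set()
        worklist = list(shadow_stack)
        while worklist:
            word = worklist.pop()
            # Only words that look like live block pointers are followed.
            # A tagged handle (>= 0x80000000) is never a key in `blocks`.
            if word in self.blocks and word not in marked:
                marked.add(word)
                worklist.extend(self.blocks[word])
        freed = [p for p in self.blocks if p not in marked]
        for p in freed:
            del self.blocks[p]
        return freed


heap = ToyHeap()
map_store = {}                       # host-side dict, never scanned

jarray = heap.alloc([1, 2, 3])       # heap-allocated map value
handle = 7
map_store[handle] = {"key": jarray}  # the pointer lives only in Python memory
wrapper = heap.alloc([0, handle | HANDLE_TAG])  # tag word, then tagged handle

freed = heap.collect(shadow_stack=[wrapper])
print(freed == [jarray])                        # True
print(map_store[handle]["key"] in heap.blocks)  # False
```

The wrapper survives because it is rooted, but the only reference to the value block sits in the Python dict, so the sweep frees it while `map_store` still hands out its address.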

Which Map element types are affected

  • ❌ Vulnerable: Map<K, Json>, Map<K, HtmlNode>, Map<K, Md*>, any user Map<K, T> where T is heap-allocated (e.g. Result<...>, Option<...>, ADTs, lists, arrays, …)
  • ✅ Safe: Map<K, V_inline> where values are inline scalars (Int, Bool, Float, etc.) — those are stored as Python ints/bools and never reach the WASM heap
  • ✅ Safe: Map<K, String>, because strings are stored as Python str in _map_store and re-encoded into WASM memory on map_get. HtmlElement.attrs is in this category, which is why no #692-equivalent bug fired there

Reproducer (sketch — not yet verified)

let @Json = json_parse("{\"key\": [1,2,3,4,5,6,7,8,9,10]}");
match @Json.0 {
  Ok(@Json) -> {
    -- After parse: wrapper_ptr is rooted via @Json; _map_store[handle]["key"]
    -- points to a JArray heap block that is NOT reachable from the WASM scan.
    -- Force GC pressure with a large unrelated alloc:
    let @Array<Int> = array_range(0, 100000);
    -- That alloc grows memory → triggers gc_collect → frees the JArray block
    -- referenced from _map_store["key"].
    let @Option<Json> = json_get(@Json.0, "key");
    -- @Option now wraps a pointer to freed memory.  Subsequent access:
    match @Option<Json>.0 {
      Some(@Json) -> {
        let @Int = json_array_length(@Json.0);
        -- Either traps or returns garbage depending on what landed at the freed slot.
        IO.print(int_to_string(@Int.0))
      },
      None -> IO.print("none")
    }
  },
  Err(_) -> IO.print("err")
}

The reporter should verify whether this reliably traps or returns garbage; the contract is broken either way.

Fix options

Option A — host-side WASM container per map entry. In _alloc_map_wrapper, for each value that is a heap pointer, allocate a tiny WASM container holding the i32, store the container pointer in _map_store[handle] instead of the raw int. Conservative scan reaches the container via the wrapper ADT and follows the contained pointer.
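A rough sketch of Option A in a toy heap model follows. The issue does not spell out how the wrapper ADT links to the per-entry containers, so this sketch assumes the wrapper gains one extra word pointing at a "spine" block that lists the containers. That spine, and every name in the code, is hypothetical.

```python
# Toy model only; every name here is hypothetical.
HANDLE_TAG = 0x80000000

blocks = {}          # block pointer -> list of i32 words
map_store = {}       # host handle -> {key: container pointer}
_next = [0x1000]


def alloc(words):
    ptr = _next[0]
    _next[0] += 0x10
    blocks[ptr] = list(words)
    return ptr


def collect(shadow_stack):
    """Conservative mark from the shadow stack, then sweep unmarked blocks."""
    marked, worklist = set(), list(shadow_stack)
    while worklist:
        word = worklist.pop()
        if word in blocks and word not in marked:
            marked.add(word)
            worklist.extend(blocks[word])
    for ptr in [p for p in blocks if p not in marked]:
        del blocks[ptr]


def alloc_map_wrapper(handle, entries):
    """Box each heap-pointer value and hang the boxes off the wrapper."""
    boxes = {key: alloc([value_ptr]) for key, value_ptr in entries.items()}
    map_store[handle] = boxes              # container pointers, not raw ints
    spine = alloc(list(boxes.values()))    # ASSUMPTION: block linking the boxes
    return alloc([0, handle | HANDLE_TAG, spine])


def map_get(handle, key):
    box = map_store[handle][key]
    return blocks[box][0]                  # unbox to recover the value pointer


jarray = alloc([1, 2, 3])
wrapper = alloc_map_wrapper(7, {"key": jarray})
collect(shadow_stack=[wrapper])
print(map_get(7, "key") == jarray and jarray in blocks)  # True
```

In this model the scan walks wrapper, spine, container, value, so the value block stays marked without the collector ever looking at `map_store`.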

Option B — Extend GC tracing to walk _map_store. Phase 2c (which already iterates the wrap-table looking for unreachable wrappers) gains a second sweep: for each REACHABLE Map wrapper, mark every heap-pointer-shaped value in _map_store[handle]. Requires the host to know which values are heap pointers vs inline scalars (currently _map_store is untyped).
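As a sketch of what Option B might look like, the toy model below adds a per-handle flag recording whether a map's values are heap pointers, then marks those values for every reachable wrapper. The `heap_valued` set, the `wrap_table` layout, and all function names are assumptions made for illustration, not Vera's real structures. The pass repeats until nothing new is found, because a map value may itself be another map wrapper (for example a nested JObject).

```python
# Toy model only; every name here is hypothetical.
HANDLE_TAG = 0x80000000

blocks = {}          # block pointer -> list of i32 words
wrap_table = {}      # wrapper pointer -> host handle
map_store = {}       # host handle -> {key: value}
heap_valued = set()  # ASSUMPTION: handles whose values are heap pointers
_next = [0x1000]


def alloc(words):
    ptr = _next[0]
    _next[0] += 0x10
    blocks[ptr] = list(words)
    return ptr


def alloc_map_wrapper(handle, entries, values_are_heap_ptrs):
    map_store[handle] = dict(entries)
    if values_are_heap_ptrs:
        heap_valued.add(handle)
    wrapper = alloc([0, handle | HANDLE_TAG])
    wrap_table[wrapper] = handle
    return wrapper


def mark_from(roots, marked):
    """Conservatively mark everything reachable from `roots`."""
    worklist = list(roots)
    while worklist:
        word = worklist.pop()
        if word in blocks and word not in marked:
            marked.add(word)
            worklist.extend(blocks[word])


def collect(shadow_stack):
    marked = set()
    mark_from(shadow_stack, marked)
    visited = set()
    progress = True
    while progress:  # repeat: a map value may itself be a map wrapper
        progress = False
        for wrapper, handle in wrap_table.items():
            if wrapper in marked and handle not in visited:
                visited.add(handle)
                progress = True
                if handle in heap_valued:
                    mark_from(map_store[handle].values(), marked)
    # Sweep. Cleanup of wrap_table and map_store entries is omitted here.
    for ptr in [p for p in blocks if p not in marked]:
        del blocks[ptr]


jarray = alloc([1, 2, 3])
inner = alloc_map_wrapper(1, {"items": jarray}, values_are_heap_ptrs=True)
outer = alloc_map_wrapper(2, {"child": inner}, values_are_heap_ptrs=True)
garbage = alloc([9, 9])
collect(shadow_stack=[outer])
print(jarray in blocks, inner in blocks, garbage in blocks)  # True True False
```

The flag is supplied at wrapper creation in this sketch. Where that information would come from in the real code generator is exactly the open question noted above.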

Option C — Don't store heap pointers in _map_store. All values are serialised into WASM memory at insertion time, deserialised at access. Heaviest change; possibly best long-term.

Option A is the surgical fix. Option B requires per-handle type info. Option C changes the Map contract.

Why this didn't trip the #692 fix's tests

The four TestHostWalkerGCRooting692 tests plus the conformance test exercise json_parse and then immediately match-and-print. No allocation happens between json_parse returning and the program exiting, so GC never fires post-walk. The val_ptrs in _map_store are technically unrooted, but the program ends before a collection can reclaim them and before anything observes the problem.

A test that does json_parse(...) → match → trigger_large_alloc → map_get(..., "key") would surface the bug. Adding such a test is part of the work for this issue.

Workaround for users today

For inputs that genuinely require Map<K, T_heap> semantics: keep the map structurally shallow and avoid intermediate allocations between map construction and use. For JSON specifically: prefer JArray over JObject where the data model allows (JArray's backing IS in WASM memory and visible to GC).

Suggested next steps

  1. Verify the reproducer above actually triggers the bug (write a unit test that fires GC between json_parse and json_get).
  2. Pick a fix option (A is the surgical default).
  3. Implement + add a regression test.
  4. Audit other Map<K, T_heap> usage in the standard library / examples for shape-similar code.
