Context
Surfaced by CodeRabbit on PR #577 (#573 phase 1-3 / heap-wrap-as-ADT migration).
After #573, every Map<K, V> / Set<T> / Decimal value is a pointer to an 8-byte wrapper ADT on the GC heap. The wrapper body has the magic tag at offset 0 and the raw host-store handle (a small i32 index) at offset 4.
Phase 2b (mark) of $gc_collect does a conservative word-by-word scan of every reachable object's payload, checking whether each i32 word looks like a heap pointer (in heap range, 8-byte aligned). When it scans a wrapper, it checks both:
- The tag at offset 0: currently 0xFEEDC001 / 0xFEEDC002 / 0xFEEDC003. These sit near the top of the 32-bit address space (about 3.98 GiB), well above any plausible heap_ptr, so the heap-range check rejects them. Safe.
- The host handle at offset 4: a small integer (1, 2, 3, ...). In a typical program, heap_ptr starts at ~147 KiB and grows, so handle values stay well below gc_heap_start and the heap-range check rejects them. Safe in practice.
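The range-and-alignment test described above can be modelled in a few lines. This is a minimal sketch, not the actual $gc_collect code: the name looks_like_heap_pointer and the HEAP_PTR value are hypothetical, and the alignment rule is assumed to be the `(val - gc_heap_start) % 8 == 4` check quoted elsewhere in this issue.

```python
GC_HEAP_START = 147 * 1024             # assumed initial heap start (~147 KiB)
HEAP_PTR = GC_HEAP_START + 64 * 1024   # hypothetical current bump pointer


def looks_like_heap_pointer(word, heap_start=GC_HEAP_START, heap_ptr=HEAP_PTR):
    """Model of the conservative check: inside the heap range and aligned."""
    in_range = heap_start <= word < heap_ptr
    aligned = (word - heap_start) % 8 == 4
    return in_range and aligned


# The three wrapper tags are far above HEAP_PTR, so the range check rejects them.
TAGS = (0xFEEDC001, 0xFEEDC002, 0xFEEDC003)
tag_hits = [looks_like_heap_pointer(t) for t in TAGS]

# Small handles are far below GC_HEAP_START, so the range check rejects them too.
small_handle_hits = [looks_like_heap_pointer(h) for h in range(1, 1000)]
```

Under these assumptions both lists contain only False, which matches the "safe" and "safe in practice" verdicts for the two wrapper words.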
The latent issue
For very-long-running programs (hundreds of thousands of host-store allocations), the handle counter could exceed the initial gc_heap_start value (~147 KiB, i.e. roughly 150,000). At that point a handle that lies below heap_ptr and happens to satisfy the alignment check (val - gc_heap_start) % 8 == 4 would falsely look like a heap pointer to the conservative scan, which would mark an unrelated heap object as reachable.
This is a retention issue, not a correctness one — the falsely-marked object is still a valid heap object; it just sticks around longer than it should. No use-after-free, no corruption. But for very long sessions it produces unbounded retention.
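To make the failure mode concrete, the sketch below counts which raw handle values would pass a modelled version of the scan's check. It is illustrative only: HEAP_PTR, the helper names, and the exact form of the check are assumptions, not the real collector.

```python
GC_HEAP_START = 147 * 1024             # 150,528; assumed initial heap start
HEAP_PTR = GC_HEAP_START + 64 * 1024   # hypothetical current bump pointer


def looks_like_heap_pointer(word):
    """Modelled conservative check: inside the heap range and aligned."""
    return GC_HEAP_START <= word < HEAP_PTR and (word - GC_HEAP_START) % 8 == 4


def false_positive_handles(max_handle):
    """Raw handle values in 1..max_handle that the scan would mistake for pointers."""
    return [h for h in range(1, max_handle + 1) if looks_like_heap_pointer(h)]


# Below the heap start no handle can collide with a heap address.
before = false_positive_handles(100_000)

# Once the counter passes GC_HEAP_START, one handle in every eight collides,
# starting at GC_HEAP_START + 4.
after = false_positive_handles(200_000)
```

In this model, `before` is empty while `after` is not, so the problem only appears after the handle counter crosses gc_heap_start.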
Fix options
- Self-describing wrappers: store handle | 0x80000000 at offset 4 instead of the raw handle. The high bit makes the value always at least 2^31, which is outside the heap range as long as the heap stays below 2 GiB. Unwrap site does i32.load offset=4; i32.const 0x7FFFFFFF; i32.and to recover the raw handle. Slightly more expensive on every host call.
- Per-object skip-scan flag: header bit indicating "don't scan this object's payload conservatively, it has no GC children". Touches every header reader.
- Wrap-table cross-reference at scan time: for each scanned object, check if its ptr is in the wrap table; if so, skip. O(n*m) cost where n=heap objects, m=wrap entries. Slow.
- Tighten the heap-range check: track the maximum host-handle value ever allocated and use it as a lower bound for the heap-range check. Cheap, correctness-preserving.
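The first option (self-describing wrappers) amounts to a one-bit tag on the stored handle. The following sketch models the wrap and unwrap steps; wrap_handle and unwrap_handle are hypothetical names, and the real unwrap site would be the i32.and sequence quoted in the list above.

```python
HANDLE_TAG = 0x80000000    # high bit set on every stored handle
HANDLE_MASK = 0x7FFFFFFF   # mask applied at the unwrap site


def wrap_handle(handle):
    """Value stored at offset 4 of the wrapper: the raw handle with the high bit set."""
    if not 0 <= handle <= HANDLE_MASK:
        raise ValueError("handle must fit in 31 bits")
    return handle | HANDLE_TAG


def unwrap_handle(word):
    """Recover the raw handle, mirroring i32.const 0x7FFFFFFF; i32.and."""
    return word & HANDLE_MASK


# Round trip for a few handles, including ones past the ~147 KiB heap start.
samples = [1, 2, 150_532, 200_000, HANDLE_MASK]
stored = [wrap_handle(h) for h in samples]
recovered = [unwrap_handle(w) for w in stored]
```

Every stored value is at least 2^31, so a range check against a heap that stays below 2 GiB rejects it regardless of how large the handle counter grows. The trade-off is that handles are limited to 31 bits.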
Acceptance
- Conservative scan no longer treats wrapper_obj.field0 as a candidate heap pointer.
- All existing tests pass.
- A new regression test demonstrates that a host-handle counter exceeding gc_heap_start doesn't produce spurious retention (verify via direct linear-memory inspection of the heap or by observing that the heap doesn't grow unboundedly under sustained allocation pressure).
Why low priority
Bounded by gc_heap_start (~147 KiB initially) — practical programs allocate <100K host handles per execute() call. Tracked for very-long-running scenarios (server programs, interactive sessions running for hours).