Context
Surfaced by CodeRabbit on PR #577 (#573 phase 1-3 / heap-wrap-as-ADT migration).
$register_wrapper (in vera/codegen/assembly.py) currently traps with unreachable when the wrap-table is full (4 096 simultaneously-live entries × 16 bytes = 64 KiB region):
global.get $gc_wrap_ptr
global.get $gc_wrap_end
i32.ge_u
if
  unreachable
end
The reviewer's suggestion: instead of trapping, try $gc_collect first to compact the wrap-table (Phase 2c drops dead entries and compacts survivors in place); only trap if the table is still full after compaction.
Why this is a real limitation
For programs that create wrappers faster than they make other heap allocations, the wrap-table can fill before the heap does. At that point $register_wrapper traps even though Phase 2c would have reclaimed most entries. Any burst of wrapper allocations that exhausts the table's free slots (at most 4 096) between two $gc_collect events triggers the trap.
Why current tests don't hit it in practice
Every wrapper IS a heap allocation (8-byte body + 4-byte header ≈ 16 bytes). 4 096 wrappers = ~64 KiB of heap consumed. The heap starts at one page (64 KiB) and grows, with $alloc triggering GC when full. So in practice both fill at similar rates — GC fires before the trap. The 10K-iter array_fold stress tests in TestHostHandleReclamation573 validate this empirically (10× the wrap-table capacity, no trap, post-GC residual = 1 for Map / < 2K for Set / < 1.5K for Decimal).
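The sizing argument above can be checked as plain arithmetic. This is only a sketch using the figures quoted in this issue; the constant names are made up for illustration and do not come from vera/codegen/assembly.py.

```python
# Figures quoted in this issue; the names are hypothetical.
WRAP_TABLE_ENTRIES = 4096    # simultaneously-live wrap-table entries
WRAP_ENTRY_BYTES = 16        # bytes per wrap-table entry
WRAPPER_HEAP_BYTES = 16      # approximate heap footprint of one wrapper
WASM_PAGE_BYTES = 64 * 1024  # initial heap size: one 64 KiB page

# Size of the wrap-table region.
table_bytes = WRAP_TABLE_ENTRIES * WRAP_ENTRY_BYTES

# Heap consumed by the wrappers needed to fill the table.
heap_bytes_for_full_table = WRAP_TABLE_ENTRIES * WRAPPER_HEAP_BYTES

print(table_bytes, heap_bytes_for_full_table, WASM_PAGE_BYTES)
```

Under these figures, filling the table consumes about one page of heap, which is why $alloc normally triggers a collection at roughly the same time as the table fills.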
Fix design
Two-step register: on overflow, root the new wrapper, call $gc_collect, re-check, only trap if still full.
;; on overflow, save the in-flight wrapper as a temporary root
global.get $gc_wrap_ptr
global.get $gc_wrap_end
i32.ge_u
if
  ;; (a) push the in-flight wrapper ptr on the shadow stack
  ;;     so $gc_collect doesn't sweep it
  global.get $gc_sp
  local.get $ptr
  i32.store
  global.get $gc_sp
  i32.const 4
  i32.add
  global.set $gc_sp
  ;; (b) collect (compacts wrap-table via Phase 2c)
  call $gc_collect
  ;; (c) pop the temporary root
  global.get $gc_sp
  i32.const 4
  i32.sub
  global.set $gc_sp
  ;; (d) re-check; trap if still full
  global.get $gc_wrap_ptr
  global.get $gc_wrap_end
  i32.ge_u
  if
    unreachable
  end
end
;; ... existing append logic ...
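The control flow of the two-step register can also be modelled in a few lines of Python, which may help when reasoning about the expected behaviour before touching the WAT. This is a toy model, not the real collector: the class, its fields and the `live` set are all hypothetical, and `gc_collect` here only imitates the "drop dead entries, keep survivors" effect attributed to Phase 2c.

```python
class WrapTableFull(Exception):
    """Stands in for the Wasm `unreachable` trap."""


class WrapTableModel:
    """Toy model of the wrap-table; every name here is hypothetical."""

    def __init__(self, capacity=4096):
        self.capacity = capacity
        self.entries = []       # registered wrapper ids, in table order
        self.shadow_stack = []  # temporary GC roots
        self.live = set()       # wrappers the program still references
        self.collections = 0

    def gc_collect(self):
        # Imitates Phase 2c only: drop dead entries, keep survivors in order.
        self.collections += 1
        reachable = self.live | set(self.shadow_stack)
        self.entries = [w for w in self.entries if w in reachable]

    def register_wrapper(self, wrapper):
        if len(self.entries) >= self.capacity:
            self.shadow_stack.append(wrapper)       # (a) root in-flight wrapper
            self.gc_collect()                       # (b) collect and compact
            self.shadow_stack.pop()                 # (c) pop the temporary root
            if len(self.entries) >= self.capacity:  # (d) re-check
                raise WrapTableFull("wrap-table full after compaction")
        self.entries.append(wrapper)


# Short-lived wrappers: none are added to `live`, so each collection
# empties the table and registration never traps.
table = WrapTableModel(capacity=8)
for i in range(100):
    table.register_wrapper(i)

# All-live wrappers: compaction frees nothing, so the trap still fires.
full = WrapTableModel(capacity=8)
trapped = False
try:
    for i in range(9):
        full.live.add(i)
        full.register_wrapper(i)
except WrapTableFull:
    trapped = True
```

The second scenario is the case the fix deliberately leaves as a trap: more simultaneously-live wrappers than the table can hold.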
Subtleties to design:
- Re-entrancy: $register_wrapper is called from inside $alloc-triggered code paths (every Map/Set/Decimal call-site). Calling $gc_collect from inside it must be safe under whatever shadow-stack state the caller has set up. The shadow-push immediately before the collect roots the in-flight wrapper, but other in-flight values on the operand stack at the call site may not be rooted.
- Cost on the hot path: every $register_wrapper call now has a conditional branch and (if hit) a full GC. In practice the slow path fires only when near-full, so amortised cost is low.
- Test coverage: a new stress test that creates 5K+ wrappers per $gc_collect interval (e.g. by doing wrap allocations while suppressing other heap allocations) would exercise the slow path. Without such a test, the fix is unobservable in normal use.
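The re-entrancy concern in the first bullet can be illustrated with a deliberately tiny model. It assumes a collector that keeps only objects named on the shadow stack and ignores the object graph entirely; the object names and the `collect` helper are hypothetical and say nothing about how $gc_collect actually traces.

```python
def collect(heap, shadow_stack):
    """Toy sweep: keep only objects that appear on the shadow stack."""
    return {obj for obj in heap if obj in shadow_stack}


heap = {"map_handle", "in_flight_wrapper"}
shadow_stack = []

# Case 1: the caller holds "map_handle" only on the operand stack.
# Only the in-flight wrapper is rooted before the collect.
shadow_stack.append("in_flight_wrapper")
survivors_unrooted_caller = collect(heap, shadow_stack)
shadow_stack.pop()

# Case 2: the caller rooted "map_handle" before calling in.
shadow_stack.append("map_handle")
shadow_stack.append("in_flight_wrapper")
survivors_rooted_caller = collect(heap, shadow_stack)
shadow_stack.pop()
shadow_stack.pop()

print(sorted(survivors_unrooted_caller))
print(sorted(survivors_rooted_caller))
```

In the first case the caller's value is swept even though the wrapper survives. That is the situation each Map/Set/Decimal call-site would need to be audited for before $register_wrapper is allowed to collect.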
Acceptance
- $register_wrapper no longer traps on first fullness; instead it triggers $gc_collect, re-checks, and only traps if still full after compaction.
- A new regression test creates a wrapper-heavy synthetic load (e.g. tight loop wrapping handles in scratch ADTs that go dead immediately, with no other heap pressure) and verifies it runs cleanly past the 4 096-wrapper threshold.
- Existing TestHostHandleReclamation573 regressions still pass (their behaviour is unchanged, since they never hit the slow path).
Why low priority
Bounded by heap fill rate. No reported program has hit the trap. The trap behaviour is preferable to a silent leak, so the current design is conservative-correct. Tracked for the case where someone hits the trap on a real workload.