Active reclamation of host-store handles (deferred from #346) #573

@aallan

Context

Splits out from #346. The original issue's title was "Opaque handle memory leak in host stores" and grouped naturally with #347 / #490 (also opaque-handle hygiene). In v0.0.132 (PR #572) the codegen-time half (#347 + #490) shipped via a _is_host_handle_type classifier that excludes Map / Set / Decimal handles from GC-rooting decisions. This issue tracks the residual: active reclamation of unreachable handles from Python-side stores.

The leak

_map_store, _set_store, and _decimal_store in vera/codegen/api.py are append-only. Every map_insert / set_add / decimal_* operation allocates a new handle without releasing transient predecessors. For example, within a single execute() call:

let @Map<Nat, Nat> = map_new();                   -- handle 1
let @Map<Nat, Nat> = map_insert(@Map<Nat, Nat>.0, 1, 100);  -- handle 2
let @Map<Nat, Nat> = map_insert(@Map<Nat, Nat>.0, 2, 200);  -- handle 3

After this, _map_store has 3 entries ({1: {}, 2: {1: 100}, 3: {1: 100, 2: 200}}) even though only handle 3 is live. Handles 1 and 2 are unreachable but are not freed until execute() returns and Python's GC reclaims the entire store dict.

The leak is bounded by Python GC at execute() exit, so single-shot programs are unaffected. It matters for long-running execution contexts (server programs, repeated execute() calls in a single process).
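The growth in the three-step example above can be reproduced outside Vera with a small Python model of an append-only store. This is only a sketch: HostStore and its map_new / map_insert methods are hypothetical stand-ins, not the actual helpers in vera/codegen/api.py.

```python
# Hypothetical model of an append-only handle store.
# This is NOT the real vera/codegen/api.py code.
class HostStore:
    def __init__(self):
        self._map_store = {}   # handle index -> Python dict
        self._next = 1         # next handle index to hand out

    def _alloc(self, value):
        handle = self._next
        self._next += 1
        self._map_store[handle] = value
        return handle

    def map_new(self):
        return self._alloc({})

    def map_insert(self, handle, key, value):
        # Copy the old map into a fresh handle; the old handle is never released.
        updated = dict(self._map_store[handle])
        updated[key] = value
        return self._alloc(updated)


store = HostStore()
h = store.map_new()               # handle 1
h = store.map_insert(h, 1, 100)   # handle 2
h = store.map_insert(h, 2, 200)   # handle 3

print(h)                 # 3
print(store._map_store)  # {1: {}, 2: {1: 100}, 3: {1: 100, 2: 200}}
```

Only handle 3 is still referenced by the caller, yet all three entries stay in the store for as long as the store object lives.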

Why the v0.0.132 PR didn't ship a fix

An earlier draft of PR #572 attempted to close #346 with a host_gc_sweep host import that walked the live Vera heap + shadow stack to identify reachable handle indices. The resulting design grew six interlocking pieces:

  1. Heap walk (parse object headers, scan payloads for handle indices)
  2. Shadow stack scan (handles rooted as params/captures/accumulators)
  3. Transitive closure (a Map containing Set handles must keep those Sets live)
  4. Re-entrancy guard (_in_host_alloc flag — sweep skips when called from inside a Python helper that's currently allocating, since in-flight handles are on the Python interpreter stack)
  5. Shadow_push at let-binding time for host-handle types (host functions returning handles bypass the ADT-constructor push)
  6. JSON/HTML emission gates (those paths populate _map_store too)

Each piece was necessary for the previous one to work, which is a smell. The complexity was disproportionate to the practical impact — Vera doesn't yet have long-running execution contexts where the leak matters in practice, and a future maintainer touching any one piece could subtly break the others.

Recommended design: heap-wrap-as-ADT

Instead of running a parallel reclamation system, integrate handle reclamation into the existing mark-sweep GC. Sketch:

  1. Define synthetic ADTs MapHandle(i32), SetHandle(i32), DecimalHandle(i32) — one i32 payload per handle.
  2. host_map_new, host_set_new, host_decimal_alloc return a Vera-heap pointer to the wrapped ADT (single-field allocation, marked at the construction site like any other ADT).
  3. Vera-side code receives the ADT pointer and uses pattern matching or an auto-derived accessor to extract the i32 handle when calling subsequent host ops.
  4. When $gc_collect sweeps an unreachable MapHandle/SetHandle/DecimalHandle ADT, it emits a destructor callback (a new host import like host_decref_handle(kind: i32, idx: i32)) that removes the entry from the corresponding Python-side store.
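The four steps above can be sketched in Python against a toy heap in which each wrapper ADT is one entry keyed by address. Host, ToyHeap, wrap, and the KIND_* constants are assumptions made for illustration; only host_decref_handle(kind, idx) is taken from the proposal, and the real $gc_collect is part of the generated runtime rather than a Python class.

```python
# Toy model of the heap-wrap-as-ADT proposal. All names are illustrative
# assumptions except host_decref_handle, which the issue proposes.
KIND_MAP, KIND_SET, KIND_DECIMAL = 0, 1, 2


class Host:
    """Python-side stores, keyed by handle index."""

    def __init__(self):
        self.stores = {KIND_MAP: {}, KIND_SET: {}, KIND_DECIMAL: {}}
        self._next = 1

    def alloc(self, kind, value):
        idx = self._next
        self._next += 1
        self.stores[kind][idx] = value
        return idx

    def host_decref_handle(self, kind, idx):
        # Destructor callback: drop the store entry of a swept wrapper ADT.
        self.stores[kind].pop(idx, None)


class ToyHeap:
    """Stand-in for the Vera heap: address -> (kind, handle index)."""

    def __init__(self, host):
        self.host = host
        self.objects = {}
        self._next_addr = 8

    def wrap(self, kind, idx):
        addr = self._next_addr
        self._next_addr += 8
        self.objects[addr] = (kind, idx)
        return addr

    def gc_collect(self, roots):
        # Marking is trivial here because wrapper ADTs hold no pointers.
        live = set(roots)
        for addr in [a for a in self.objects if a not in live]:
            kind, idx = self.objects.pop(addr)
            self.host.host_decref_handle(kind, idx)


def map_new(host, heap):
    return heap.wrap(KIND_MAP, host.alloc(KIND_MAP, {}))


def map_insert(host, heap, ptr, key, value):
    _, idx = heap.objects[ptr]  # unwrap the i32 handle from the ADT
    updated = dict(host.stores[KIND_MAP][idx])
    updated[key] = value
    return heap.wrap(KIND_MAP, host.alloc(KIND_MAP, updated))


host = Host()
heap = ToyHeap(host)
m = map_new(host, heap)
m = map_insert(host, heap, m, 1, 100)
m = map_insert(host, heap, m, 2, 200)

heap.gc_collect(roots=[m])      # only the final Map is rooted
print(host.stores[KIND_MAP])    # {3: {1: 100, 2: 200}}
```

In this model the store shrinks as a side effect of the ordinary sweep, so no separate reachability analysis of handle indices is needed.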

Trade-offs vs. the parallel-sweep design:

| Aspect | parallel sweep (host_gc_sweep) | heap-wrap-as-ADT (this proposal) |
|---|---|---|
| Implementation complexity | 6 interlocking pieces | 1 mechanism (destructor callback hooked into existing sweep) |
| Re-entrancy concerns | Yes (#572 round 2) | No (sweep already runs at well-defined moments) |
| Conservative-scan false positives | Yes (Int payload values matching handle indices over-retain) | No (handle's reachability follows the wrapping ADT) |
| Host-call signature changes | None | Every Map/Set/Decimal-returning op signature changes (i32 → i32 ptr) |
| Vera source-level visibility | Transparent | Adds MapHandle / SetHandle / DecimalHandle types (might want to alias to keep Map<K, V> shorthand working) |
| Cost of getting it wrong | Pruning live handle = KeyError or corruption | Destructor leak = same per-execute() leak we have now (no regression) |

The wrapping-ADT path is a larger one-time refactor of the host helpers but eliminates a whole class of correctness traps.
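The "conservative-scan false positives" row can be made concrete with a few lines of Python. This is a hypothetical illustration, not the scan from the PR #572 draft: conservative_live_handles and the flat heap_words list are invented for the example.

```python
# Hypothetical illustration of over-retention under a conservative scan.
def conservative_live_handles(heap_words, store):
    # Any heap word that happens to equal a store key is treated as a handle.
    return {w for w in heap_words if w in store}


map_store = {1: {}, 2: {1: 100}, 3: {1: 100, 2: 200}}

# Heap contents: the one real handle (3), an unrelated Int whose value is 2,
# and an unrelated Int 200 that matches no handle index.
heap_words = [3, 2, 200]

live = conservative_live_handles(heap_words, map_store)
retained = {h: v for h, v in map_store.items() if h in live}

print(sorted(live))      # [2, 3]
print(sorted(retained))  # [2, 3]
```

Handle 2 is dead, but the plain Int 2 keeps it alive. With a wrapper ADT, liveness is decided by the pointer to the wrapper, so an integer that merely equals a handle index has no effect.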

Scope

  • All host ops that currently return raw handle indices: map_new, map_insert, map_remove, set_new, set_add, set_remove, decimal_from_int, decimal_from_string, decimal_add, decimal_sub, decimal_mul, decimal_div, etc.
  • The _handle_value_ints traversal pattern from the draft of PR #572 can be reused to handle nested handles (a Map containing Set handles as values): when the destructor fires for an outer Map, scan its values for handles to decref.
  • Browser parity: mirror in vera/browser/runtime.mjs.
  • JSON / HTML use _map_store for JObject / HtmlElement attrs, so those allocations would need to go through the same ADT path; otherwise, accept that JSON/HTML-only programs leak (probably acceptable).

Acceptance

  • Long-running scenario test: a loop of 10 000 map_insert chains where only the final Map is reachable should leave _map_store size ≤ 100 (or some small constant), not 10 000.
  • All existing host-store tests still pass.
  • Browser parity tests pass.
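The long-running scenario could be checked along the following lines. This sketch runs against a toy runtime, not the real Vera execution path: ToyRuntime, its gc_every threshold, and the explicit roots set are assumptions standing in for the real allocator, GC trigger, and shadow stack.

```python
# Sketch of the acceptance scenario against a toy runtime.
# ToyRuntime, gc_every, and roots are hypothetical stand-ins.
class ToyRuntime:
    def __init__(self, gc_every=64):
        self.map_store = {}    # handle index -> Python dict
        self.wrappers = {}     # heap address -> handle index
        self.roots = set()     # addresses currently reachable
        self.gc_every = gc_every
        self._next_idx = 1
        self._next_addr = 8
        self._since_gc = 0

    def _alloc(self, value):
        # Collect before allocating once enough allocations have piled up.
        if self._since_gc >= self.gc_every:
            self.gc_collect()
        idx, addr = self._next_idx, self._next_addr
        self._next_idx += 1
        self._next_addr += 8
        self.map_store[idx] = value
        self.wrappers[addr] = idx
        self._since_gc += 1
        return addr

    def gc_collect(self):
        # Sweep unrooted wrappers and drop their store entries (the destructor).
        for addr in [a for a in self.wrappers if a not in self.roots]:
            self.map_store.pop(self.wrappers.pop(addr), None)
        self._since_gc = 0

    def map_new(self):
        return self._alloc({})

    def map_insert(self, ptr, key, value):
        updated = dict(self.map_store[self.wrappers[ptr]])
        updated[key] = value
        return self._alloc(updated)


rt = ToyRuntime()
m = rt.map_new()
rt.roots = {m}
for i in range(10_000):
    m = rt.map_insert(m, 0, i)   # the previous Map becomes unreachable
    rt.roots = {m}

assert len(rt.map_store) <= 100
print(rt.map_store[rt.wrappers[m]])  # {0: 9999}
```

In this model the store never holds more than gc_every + 1 entries, which is the kind of small constant bound the acceptance criterion asks for. A real test would drive the same loop through execute() and inspect _map_store afterwards.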

Out of scope

This issue tracks only the active reclamation work. The codegen-time half (#347 + #490) shipped in v0.0.132.
