Architectural follow-up to the mirror fix landing in the PR that closes #695 and #705.
Context
The mirror fix resolves the immediate correctness bugs:
- The conservative GC scan now reaches heap-pointer values via shadow stack → wrapper → WASM-resident bucket array → val_ptr.
- The Python-side _map_store and _set_store remain the source of truth for actual map/set contents.
- The bucket array is a write-only mirror, populated by host_attach_bucket (CLI) and imports.vera.attach_bucket_to_wrapper (browser).
The architectural debt: data lives in two places. Drift between the Python store and the WASM bucket is possible if a future change writes to one but not the other. Code paths multiply. Browser parity required reimplementing the population logic in JavaScript.
Goal: move to bucket-as-truth
Delete _map_store and _set_store. Make the WASM bucket array the sole source of truth. Host imports take wrapper_ptr (not opaque handle) and read/write the bucket directly. The wrapper IS the map / set value.
Three places this needs to land
- CLI Map host imports (vera/codegen/api.py):
  - host_map_new, host_map_size, _define_map_insert, _define_map_get, _define_map_contains, _define_map_remove, _define_map_keys, _define_map_values. All 8 currently take handle and read _map_store[handle]. Move: take wrapper_ptr, read bucket via wrapper_ptr + 8 (offset to bucket_ptr), use _dict_from_bucket to decode and _build_map_wrapper to return a new wrapper.
  - Delete _map_store, _map_alloc, host_attach_bucket (Map branch).
  - Codegen update (vera/wasm/calls_containers.py): drop _emit_unwrap_handle and _emit_wrap_handle for Map call sites, and replace the post-call wrap with a simpler shadow-root sequence that pushes the returned wrapper_ptr onto the shadow stack.
- CLI Set host imports (vera/codegen/api.py): same pattern as Map. host_set_new, host_set_size, _define_set_add, _define_set_contains, _define_set_remove, _define_set_to_array.
- Browser runtime (vera/browser/runtime.mjs): JS parallel. Delete mapStore and setStore JS Maps; rewrite all imports.vera.map_* and imports.vera.set_* to use the WASM bucket layout. Equivalent encode/decode helpers in JS.
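A rough sketch of what "take wrapper_ptr, read bucket via wrapper_ptr + 8" could look like on the Python side. Only the +8 offset to bucket_ptr comes from the text above. The function names, the use of a plain bytearray in place of WASM linear memory, the treatment of bucket_ptr == 0 as "no bucket attached", and the 8-byte (capacity, count) header proposed in the layout section are all assumptions made for illustration.

```python
import struct

BUCKET_PTR_OFFSET = 8  # from the text: bucket_ptr lives at wrapper_ptr + 8


def bucket_ptr_of(mem, wrapper_ptr):
    """Read the little-endian i32 bucket_ptr field out of a wrapper."""
    (bucket_ptr,) = struct.unpack_from("<I", mem, wrapper_ptr + BUCKET_PTR_OFFSET)
    return bucket_ptr


def map_size_from_wrapper(mem, wrapper_ptr):
    """Hypothetical replacement for a handle-based size lookup.

    Assumes the proposed bucket header (capacity at +0, count at +4).
    """
    bucket_ptr = bucket_ptr_of(mem, wrapper_ptr)
    if bucket_ptr == 0:
        # Assumption: 0 means no bucket has been attached to this wrapper.
        raise ValueError("wrapper has no bucket attached")
    _capacity, count = struct.unpack_from("<II", mem, bucket_ptr)
    return count


# Stand-in for linear memory: wrapper at address 8, bucket at address 32.
mem = bytearray(64)
struct.pack_into("<I", mem, 8 + BUCKET_PTR_OFFSET, 32)  # wrapper.bucket_ptr = 32
struct.pack_into("<II", mem, 32, 4, 3)                  # capacity = 4, count = 3
print(map_size_from_wrapper(mem, 8))  # 3
```

The point of the sketch is that the import needs nothing but the wrapper pointer and memory: there is no side table to consult, so there is nothing that can drift.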
Bucket layout (preparatory work, not in the mirror PR)
The mirror PR ships with a 12-byte slot layout (key_word_0, key_word_1, val_word) and no bucket header. For the move, the layout needs to grow:
- Slot size: 20 bytes (occupancy flag at +0, key low/high at +4/+8, val low/high at +12/+16). The occupancy flag lets non-string keys distinguish empty vs live without relying on key_word_0 == 0 (which fails for Int 0 keys). Val word pair lets string values store (ptr, len) inline without an extra heap allocation.
- Bucket header: 8 bytes (capacity at +0, count at +4). Lets map_size return in O(1) via header.count instead of scanning slots.
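To make the proposed layout concrete, here is a minimal sketch using Python's struct module over a bytearray that stands in for linear memory. The field offsets and sizes are taken from the list above. The helper names, the assumption that slots start immediately after the header, and the choice to treat keys and values as unsigned 64-bit payloads split into low/high words are illustrative, not the actual implementation.

```python
import struct

HEADER = struct.Struct("<II")    # capacity at +0, count at +4 (8 bytes)
SLOT = struct.Struct("<IIIII")   # occupied +0, key lo/hi +4/+8, val lo/hi +12/+16 (20 bytes)
MASK32 = 0xFFFFFFFF


def init_bucket(mem, bucket_ptr, capacity):
    """Write an empty header and zero every slot (occupied = 0)."""
    HEADER.pack_into(mem, bucket_ptr, capacity, 0)
    start = bucket_ptr + HEADER.size
    mem[start:start + capacity * SLOT.size] = bytes(capacity * SLOT.size)


def slot_addr(bucket_ptr, index):
    # Assumption: the slot array follows the header directly.
    return bucket_ptr + HEADER.size + index * SLOT.size


def read_slot(mem, bucket_ptr, index):
    """Return (key, val) for a live slot, or None for an empty one."""
    occupied, klo, khi, vlo, vhi = SLOT.unpack_from(mem, slot_addr(bucket_ptr, index))
    if not occupied:
        return None
    return (khi << 32) | klo, (vhi << 32) | vlo


def put(mem, bucket_ptr, index, key, val):
    """Write a slot and bump header.count if the slot was previously empty."""
    capacity, count = HEADER.unpack_from(mem, bucket_ptr)
    was_empty = read_slot(mem, bucket_ptr, index) is None
    SLOT.pack_into(mem, slot_addr(bucket_ptr, index),
                   1, key & MASK32, key >> 32, val & MASK32, val >> 32)
    if was_empty:
        HEADER.pack_into(mem, bucket_ptr, capacity, count + 1)


def map_size(mem, bucket_ptr):
    """O(1) size: read header.count instead of scanning slots."""
    return HEADER.unpack_from(mem, bucket_ptr)[1]


mem = bytearray(256)
init_bucket(mem, 16, capacity=4)
put(mem, 16, 0, key=0, val=42)     # Int 0 key: both key words are zero
print(read_slot(mem, 16, 0))       # (0, 42)
print(read_slot(mem, 16, 1))       # None
print(map_size(mem, 16))           # 1
```

The Int 0 case is the one the 12-byte layout cannot represent: slot 0 and slot 1 both have key words of zero, and only the occupancy flag tells them apart.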
Decimal exempt
Decimal is value-typed (PyDecimal in Python, BigInt in JS) — no heap pointers in the store entry, so the #695 class of bug cannot apply. The bucket_ptr field on Decimal wrappers stays 0 forever; host_attach_bucket short-circuits when kind == 3. Phase D leaves Decimal alone.
Performance challenge (what blocks the move)
A first-pass implementation hit O(N²) in pure-Python decode/encode: per-slot reads through caller["memory"].data_ptr(store)[addr:addr+4] are ~5μs each, so a 10000-element map_insert chain hangs (tens of minutes vs the old _map_store dict-copy at low-microsecond constants).
The fix is to batch the whole bucket region in one wasmtime memory access plus one struct.unpack / struct.pack call instead of per-i32. The encoders for string keys / string values still need per-string _alloc_string calls, but the slot writes can be batched at the end. Mechanical work, but ~1–2 hours of careful instrumentation to confirm the 10000-chain test (TestHostHandleReclamation573::test_map_chain_reclaims_transients) runs in reasonable time.
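The batching idea can be sketched as follows: pull the whole bucket region out of linear memory once, decode every word with a single struct.unpack_from call, and build the outgoing region with a single struct.pack call. This is an illustration under stated assumptions, not the real encoder. It works on plain bytes (how the region is fetched from wasmtime is left to the caller), uses the proposed 8-byte header and 20-byte slots, treats keys and values as unsigned 64-bit payloads, and places entries in sequential slots rather than modelling whatever slot-selection scheme the real bucket uses. The function names are hypothetical.

```python
import struct

HEADER_SIZE = 8        # capacity at +0, count at +4
WORDS_PER_SLOT = 5     # occupied, key lo/hi, val lo/hi (20 bytes)
MASK32 = 0xFFFFFFFF


def decode_bucket_batched(region):
    """Decode a whole bucket from one bytes-like region with one unpack call."""
    capacity, _count = struct.unpack_from("<II", region, 0)
    n_words = capacity * WORDS_PER_SLOT
    words = struct.unpack_from(f"<{n_words}I", region, HEADER_SIZE)
    out = {}
    for base in range(0, n_words, WORDS_PER_SLOT):
        occupied, klo, khi, vlo, vhi = words[base:base + WORDS_PER_SLOT]
        if occupied:
            out[(khi << 32) | klo] = (vhi << 32) | vlo
    return out


def encode_bucket_batched(entries, capacity):
    """Encode header plus all slots with one pack call.

    Assumption: entries go into slots 0..len-1 in iteration order.
    """
    if len(entries) > capacity:
        raise ValueError("more entries than slots")
    words = [0] * (capacity * WORDS_PER_SLOT)
    for i, (key, val) in enumerate(entries.items()):
        base = i * WORDS_PER_SLOT
        words[base:base + WORDS_PER_SLOT] = [
            1, key & MASK32, key >> 32, val & MASK32, val >> 32,
        ]
    return struct.pack(f"<II{len(words)}I", capacity, len(entries), *words)


region = encode_bucket_batched({0: 7, 5: 2**40}, capacity=4)
print(len(region))                                        # 88
print(decode_bucket_batched(region) == {0: 7, 5: 2**40})  # True
```

With this shape, the number of crossings into linear memory per host call is constant (one read, one write) regardless of bucket size; the per-slot work that remains is ordinary Python tuple and dict manipulation.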
Acceptance
- _map_store and _set_store deleted.
- JSON parser path (_alloc_map_wrapper for Map<String, Json>) routes through the new bucket builder.
- All existing tests pass, including the 10000-element reclamation chain.
- Browser tests pass.
- No mapStore / setStore in runtime.mjs.
- Architecturally, the WASM bucket array is the single source of truth.
References