Browser runtime: GC reachability fix from #695/#705 doesn't close in browser — bucket array populated but Json still reclaimed #708

@aallan

Summary

The browser runtime (vera/browser/runtime.mjs) fails GC-reachability regression tests that the CLI passes. The bucket-array mirror approach, which closes #695 and #705 on the CLI side, leaves a residual use-after-free (UAF) in the browser target.

Reproduction

Surfaced by PR #707's test_eager_gc_*_browser tests (skipped in that PR — see the skip rationale in tests/test_browser.py). Reverting to commit 4d69cb8 (before the PR-review-toolkit C1-C4 fixes) reproduces the same failure, so this is NOT caused by those fixes — it's pre-existing behavior the new tests exposed.

The simplest reproducer (matches the CLI test_eager_gc_set_of_json_post_walk_uaf exactly):

effect IO { op print(String -> Unit); }

private fn build_set(-> @Set<Json>)
  requires(true) ensures(true) effects(pure)
{
  let @Result<Json, String> = json_parse("[1,2,3,4,5,6,7,8,9,10]");
  match @Result<Json, String>.0 {
    Ok(@Json) -> set_add(set_new(), @Json.0),
    Err(@String) -> set_new()
  }
}

public fn main(-> @Unit) requires(true) ensures(true) effects(<IO>) {
  let @Set<Json> = build_set();
  let @Array<Json> = set_to_array(@Set<Json>.0);
  let @Int = array_fold(@Array<Json>.0, 0, fn(@Int, @Json -> @Int) effects(pure) {
    json_array_length(@Json.0) + @Int.0
  });
  IO.print(int_to_string(@Int.0))
}

Compile with VERA_EAGER_GC=1 vera compile --target browser, run with node. The CLI prints 10. The browser prints 0.

What we know (empirical instrumentation)

Adding console.error traces to imports.vera.attach_bucket_to_wrapper and imports.vera.host_decref_handle shows:

DEBUG attach: wrapperPtr=147655, kind=2, handle=1, setStore.get=[], gc_sp=159, heap_ptr=147667
  DEBUG set branch: bucketPtr=147671, bucketBytes=96, elements=[]
  DEBUG after: slot[0]+0=0, ..., wrapperPtr+8=147671
DEBUG attach: wrapperPtr=147775, kind=2, handle=2, setStore.get=[147607], gc_sp=163, heap_ptr=147787
  DEBUG set branch: bucketPtr=147791, bucketBytes=96, elements=[147607]
  DEBUG after: slot[0]+0=147607, ..., wrapperPtr+8=147791
DEBUG decref: kind=2, handle=1, gc_sp=155, heap_ptr=147891
{"stdout":"0",...}

So:

  • The bucket IS populated correctly — slot[0]+0 = 147607 (the Json ptr), wrapperPtr+8 = 147791 (the bucket ptr).
  • Only ONE host_decref_handle fires (for handle=1, the empty wrapper1 from set_new() — expected).
  • setStore[2] = [147607] is preserved.
  • The chain wrapper2 → bucket → slot[0]+0 = Json is in place when set_to_array$eb is called.
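For reference, traces of this kind can be collected by wrapping the host imports before the module is instantiated. This is a minimal sketch, not the instrumentation that produced the log above: traceImport is a hypothetical helper, and the stub imports object and its argument lists are assumptions standing in for the real functions in vera/browser/runtime.mjs, whose signatures may differ.

```javascript
// Hypothetical helper: replace one host import with a wrapper that calls the
// original function and then logs the arguments together with the result.
function traceImport(namespace, name, log = console.error) {
  const original = namespace[name];
  namespace[name] = (...args) => {
    const result = original(...args);
    log(`DEBUG ${name}(${args.join(", ")}) -> ${String(result)}`);
    return result;
  };
}

// Stub imports standing in for the real runtime object (assumption: the
// real functions take different, more specific arguments).
const imports = {
  vera: {
    attach_bucket_to_wrapper: (wrapperPtr, kind, handle) => wrapperPtr + 8,
    host_decref_handle: (kind, handle) => undefined,
  },
};

// Collect trace lines in an array here; the default sink is console.error.
const traceLines = [];
traceImport(imports.vera, "attach_bucket_to_wrapper", (line) => traceLines.push(line));
traceImport(imports.vera, "host_decref_handle", (line) => traceLines.push(line));

imports.vera.attach_bucket_to_wrapper(147775, 2, 2);
imports.vera.host_decref_handle(2, 1);
console.log(traceLines.join("\n"));
```

Because the wrapper forwards every argument unchanged, it should not alter call ordering or return values, which matters when the bug under investigation is timing-sensitive.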

Layout sanity check (gc_heap_start = 147603):

  • (147775 - 147603) % 8 == 4 ✓ (wrapper2 alignment OK)
  • (147791 - 147603) % 8 == 4 ✓ (bucket alignment OK)
  • (147607 - 147603) % 8 == 4 ✓ (Json alignment OK)
  • All three pass the val >= gc_heap_start + 4 && val < heap_ptr && (val - gc_heap_start) % 8 == 4 conservative-pointer check.
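
The arithmetic above can be re-checked mechanically. The sketch below re-implements the quoted predicate outside the runtime, assuming a single heap-start value is used for both the range test and the alignment test. looksLikeHeapPointer is a hypothetical name, and the heap_ptr value is the one logged at the decref trace, which is an assumption about what the scan would see.

```javascript
// Conservative-pointer predicate, transcribed from the condition in the text.
function looksLikeHeapPointer(val, gcHeapStart, heapPtr) {
  return val >= gcHeapStart + 4 && val < heapPtr && (val - gcHeapStart) % 8 === 4;
}

const GC_HEAP_START = 147603;
const HEAP_PTR = 147891; // assumption: heap_ptr as logged at the decref trace

// Addresses taken from the DEBUG output.
const candidates = { wrapper2: 147775, bucket: 147791, json: 147607 };

const verdicts = Object.fromEntries(
  Object.entries(candidates).map(([name, ptr]) => [
    name,
    looksLikeHeapPointer(ptr, GC_HEAP_START, HEAP_PTR),
  ])
);
console.log(verdicts);

// Control: an address 4 bytes past the Json pointer fails the alignment test.
const misaligned = looksLikeHeapPointer(147611, GC_HEAP_START, HEAP_PTR);
console.log({ misaligned });
```

All three logged addresses are accepted and the deliberately misaligned control is rejected, so the predicate itself does not explain why the Json object is dropped.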

So in theory the scan should mark wrapper2 → bucket → Json. But empirically Json is reclaimed and json_array_length(Json) reads 0.

Likely investigation paths

  1. Conservative scan in browser may behave differently from CLI: the same WAT runs in both, but the WASM execution semantics around memory.grow / DataView aliasing may interact with the scan differently. The scan may also rely on heap-layout assumptions that hold on the CLI side (wasmtime) but not on the browser side (V8).

  2. Header / size mismatch in the JS alloc wrapper — the WAT $alloc stores size << 1 at header. If the JS-side alloc wrapper's view of object size differs (e.g. truncation, alignment), the scan might iterate the wrong number of bytes when tracing the bucket.

  3. JS-side register_wrapper timing: the WAT-emitted _emit_wrap_handle calls register_wrapper BEFORE attach_bucket_to_wrapper. If anything reorders those calls, or if the wrap-table compaction reads stale entries, wrapper2 could be evicted prematurely.

  4. Subtle interaction with Uint8Array.fill(0) over a region currently being traced — the zero-fill happens AFTER the bucket alloc but BEFORE iterating, in the same JS host call. If a sub-GC could fire during the fill (unlikely but worth checking), state could be inconsistent.
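
One concrete way path 1 could manifest, offered as a hypothesis rather than a finding: when a non-shared WebAssembly.Memory grows, its previous ArrayBuffer is detached, so any typed array the host created earlier becomes zero-length. Whether runtime.mjs holds a view across an allocation is not established here. The sketch below only demonstrates the engine behavior on a standalone memory, and writeWord is a hypothetical helper.

```javascript
// Standalone memory (not the Vera runtime's): 1 page initially, up to 4.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });

// A view captured before the grow, as a host function might cache it.
const staleView = new Uint32Array(memory.buffer);
const lengthBeforeGrow = staleView.length; // one 64 KiB page / 4 bytes per word

// Growing detaches the ArrayBuffer that staleView was created over.
memory.grow(1);
const lengthAfterGrow = staleView.length;

// Hypothetical helper: re-read memory.buffer on every access so the
// view can never outlive a grow.
function writeWord(mem, byteOffset, value) {
  new DataView(mem.buffer).setUint32(byteOffset, value, true); // little-endian, as in wasm
}

writeWord(memory, 0, 147607);
const freshRead = new DataView(memory.buffer).getUint32(0, true);

console.log({ lengthBeforeGrow, lengthAfterGrow, freshRead });
```

If the bucket is written through a view obtained before the bucket allocation, and that allocation grows memory, this is the failure shape to look for. The DEBUG output showing slot[0]+0=147607 would argue against it only if that value was read back through a fresh view.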

Tests skipped in PR #707

Three tests in tests/test_browser.py::TestBrowserMapHostStoreGCReachability695:

  • test_eager_gc_set_of_json_browser
  • test_eager_gc_json_object_with_array_child_browser
  • test_eager_gc_map_of_json_user_level_browser

All three are currently marked @pytest.mark.skip with a reason pointing at this issue. Removing the skip once this is fixed is the "regression test passed → fix landed" handshake, the same pattern as on the CLI side.

Why not blocking PR #707

PR #707 closes #695 and #705 on the CLI side, and the CLI-side tests for those bugs pass. The browser-side reproducer of the same bug class is gated by @pytest.mark.skip. Landing PR #707 without fixing this leaves the browser-side mirror approach with a known gap, but the alternative is gating the whole PR (including the CLI fix that addresses both #695 and #705) on a deep browser-runtime investigation that may take multiple sessions to land.

Related
