Summary
Both html_parse and json_parse trap with Out-of-bounds memory access on inputs large enough to pressure GC during host-side tree marshalling, escaping their declared Result<HtmlNode, String> / Result<Json, String> contracts. Confirmed root cause: missing shadow-stack rooting in vera.wasm.html_serde.write_html and vera.wasm.json_serde.write_json — the same bug class as #570 (array_map), #515 (fold accumulator), and #593 (closure-return shadow asymmetry), but on the host side rather than WAT-emitted user code.
External report from an agent building a summarise_urls example surfaced the html_parse trap. The agent's bisection ruled out a pure size threshold — same byte count triggers or doesn't depending on what's at both ends of the window — which is exactly what GC-pressure-driven corruption looks like (it depends on what got allocated before the host walker ran).
The parallel write_json bug is proven, not claimed — reproducer and trap output below.
Confirmed reproducers
html_parse (original report)
effect IO { op print(String -> Unit); }
public fn main(@Unit -> @Unit)
requires(true) ensures(true) effects(<IO>)
{
let @Result<String, String> = IO.read_file("/path/to/FAQ.md");
match @Result<String, String>.0 {
Ok(@String) -> {
let @Result<HtmlNode, String> = html_parse(@String.0);
match @Result<HtmlNode, String>.0 {
Ok(@HtmlNode) -> IO.print("Ok\n"),
Err(@String) -> IO.print("Err\n")
}
},
Err(@String) -> IO.print("read failed\n")
}
}
Output:
Error: Out-of-bounds memory access
Source backtrace:
(suppressed 1 runtime-helper frame above first user code)
in main (file.vera:8-22)
Neither Ok nor Err is printed — the trap bypasses both match arms.
json_parse (verified during triage)
Generate a deep nested array (1000 sub-arrays × 50 numbers, ~192 KB input):
import json
data = [list(range(50)) for _ in range(1000)]
text = json.dumps(data)
# embed `text` into a Vera string literal that calls json_parse(@String.0)
effect IO { op print(String -> Unit); }
public fn main(@Unit -> @Unit)
requires(true) ensures(true) effects(<IO>)
{
let @String = "<the 192-KB JSON literal>";
let @Result<Json, String> = json_parse(@String.0);
match @Result<Json, String>.0 {
Ok(@Json) -> IO.print("Ok\n"),
Err(@String) -> IO.print(string_concat("Err: ", @String.0))
}
}
Output:
Error: Unhandled WASM trap: wasm trap: out of bounds memory access
Same root cause, same fix needed.
Root cause (one bug class, two call sites)
The trap is a wasmtime out-of-bounds access at WASM address 0xfffffffd (= 1 - 4 in signed i32) from inside $alloc's free-list traversal:
loop $fl_loop
local.get $node ;; $node = 1 here
i32.eqz
br_if $fl_done
local.get $node
i32.const 4
i32.sub ;; 1 - 4 = 0xFFFFFFFD
i32.load ;; <-- TRAP: address 0xfffffffd out of bounds
The free list contains a corrupt node pointer of value 1. 1 is the tag value for HtmlText (and Result.Err). The corruption sequence:
write_html (or write_json) allocates arr_ptr = alloc(child_count * 4) for an array of child pointers.
- Recursive call walks children, allocating strings + sub-trees. Eventually one of those allocs triggers
$gc_collect (bump-alloc would overflow current memory).
gc_collect scans the WASM shadow stack for live roots. arr_ptr is in a Python local — not on the shadow stack — so GC treats the block as garbage and links it into the free list. The free-list link is written into body[0] of the freed block.
- After GC returns, host code continues: writes a
HtmlText tag (= 1) into a freshly-allocated body that happens to be at the same address as the freed arr_ptr (because the freed block got picked up by the next alloc). The free-list link in body[0] gets overwritten with 1.
- Some later alloc traverses the free list — picks up a block whose
body[0] (i.e. node.next) is 1. Sets $gc_free_head = 1.
- The next alloc reads
$node = 1, tries i32.load (node - 4) = read at 0xFFFFFFFD. Trap.
The exact same vulnerable shape lives in both files:
# vera/wasm/html_serde.py::write_html — element branch
arr_ptr = alloc(caller, child_count * 4) # <- not rooted
for i, child in enumerate(children):
child_ptr = write_html(...) # <- may GC and free arr_ptr
write_i32(caller, arr_ptr + i * 4, child_ptr) # <- writes to freed block
# vera/wasm/json_serde.py::write_json — JArray branch
arr_ptr = alloc(caller, count * 4) # <- not rooted
for i, elem in enumerate(value):
elem_ptr = write_json(...) # <- may GC and free arr_ptr
write_i32(caller, arr_ptr + i * 4, elem_ptr) # <- writes to freed block
name_ptr / wrapper_ptr (in write_html) and intermediate value pointers stored into map_dict (in write_json's JObject branch) have the same exposure.
Empirical confirmation
Patching the host so memory is pre-grown by 256 pages (16 MB) before entering write_html makes the html_parse trap disappear and returns Ok as expected:
mem = caller["memory"]
if mem.size(caller) < 250:
mem.grow(caller, 256)
This proves the bug is GC-pressure-dependent: when no alloc during the walk actually needs to grow memory, $gc_collect never fires, no stale Python-held pointers get reclaimed, and the walk completes correctly. The same fix works for write_json by symmetry — same pattern, same workaround applies.
Scope
This affects every host walker that allocates more than one WASM object across Python-held pointers:
vera/wasm/html_serde.py::write_html — confirmed
vera/wasm/json_serde.py::write_json — confirmed
- Any future host import that builds a complex ADT — same trap waiting
Fix options (sorted by effort)
Option 1 — pre-grow memory in host walkers (small, surgical). Have host_html_parse and host_json_parse estimate an upper bound on tree size from the input length, call caller["memory"].grow(...) to ensure enough headroom, then walk normally. Estimate: 2 * len(input) + 65536 bytes is comfortably conservative for both formats. No WAT changes, no new exports. Cons: over-grows memory for small inputs; doesn't fix the architectural issue; still vulnerable if a future walker forgets to do this.
Option 2 — add $gc_disabled flag + host helpers (medium). Add a (global $gc_disabled (mut i32) (i32.const 0)), modify $alloc to skip gc_collect when set (go straight to memory.grow), and export setter/getter helpers. Host walkers set it before walking, clear after. Cons: requires WAT changes; clearing-on-exception is easy to forget.
Option 3 — export $gc_sp + $gc_stack_limit, push intermediate pointers from the host (clean, larger). Architecturally correct: host walkers push their arr_ptr / name_ptr / wrapper_ptr onto the shadow stack before the next alloc. GC sees them as live. Cons: requires WAT exports + host helpers + careful push/pop discipline in every walker; the abstraction needs to handle exception unwinding.
Recommendation: ship Option 1 immediately to close both reported call sites. File Option 3 as a follow-up for the architectural fix that prevents the next host walker from inheriting the bug.
Test that pins the fix
Two tests/conformance/ entries:
html_parse on a checked-in copy of the current FAQ body — asserts Ok (or a well-formed Err).
json_parse on a generated nested-JSON fixture sized to trigger memory growth — asserts Ok.
Either Ok or a well-formed Err is contractually acceptable — a trap is not.
Workaround for users today
- Per the original reporter's note: removing the
html_parse call works if the input was always going to be plain text anyway.
- For genuinely-HTML or genuinely-JSON inputs there is no Vera-level workaround — the pre-flight
string_slice size cap the reporter considered would have caught their specific FAQ case but is unsound as a general fix because the bisection shows length alone isn't the trigger.
Files
vera/wasm/html_serde.py::write_html — primary site (confirmed)
vera/wasm/json_serde.py::write_json — same bug class, same fix needed (confirmed)
vera/codegen/api.py::host_html_parse / host_json_parse — would gain the pre-grow call (Option 1)
vera/codegen/assembly.py::_emit_alloc — would gain the $gc_disabled check (Option 2)
Original report
The reporter's full write-up is reproduced below.
Full bug report (click to expand)
[report verbatim]
Summary
Both
html_parseandjson_parsetrap withOut-of-bounds memory accesson inputs large enough to pressure GC during host-side tree marshalling, escaping their declaredResult<HtmlNode, String>/Result<Json, String>contracts. Confirmed root cause: missing shadow-stack rooting invera.wasm.html_serde.write_htmlandvera.wasm.json_serde.write_json— the same bug class as #570 (array_map), #515 (fold accumulator), and #593 (closure-return shadow asymmetry), but on the host side rather than WAT-emitted user code.External report from an agent building a
summarise_urlsexample surfaced thehtml_parsetrap. The agent's bisection ruled out a pure size threshold — same byte count triggers or doesn't depending on what's at both ends of the window — which is exactly what GC-pressure-driven corruption looks like (it depends on what got allocated before the host walker ran).The parallel write_json bug is proven, not claimed — reproducer and trap output below.
Confirmed reproducers
html_parse(original report)Output:
Neither
OknorErris printed — the trap bypasses both match arms.json_parse(verified during triage)Generate a deep nested array (1000 sub-arrays × 50 numbers, ~192 KB input):
Output:
Same root cause, same fix needed.
Root cause (one bug class, two call sites)
The trap is a wasmtime out-of-bounds access at WASM address
0xfffffffd(=1 - 4in signed i32) from inside$alloc's free-list traversal:The free list contains a corrupt node pointer of value
1.1is the tag value forHtmlText(andResult.Err). The corruption sequence:write_html(orwrite_json) allocatesarr_ptr = alloc(child_count * 4)for an array of child pointers.$gc_collect(bump-alloc would overflow current memory).gc_collectscans the WASM shadow stack for live roots.arr_ptris in a Python local — not on the shadow stack — so GC treats the block as garbage and links it into the free list. The free-list link is written intobody[0]of the freed block.HtmlTexttag (= 1) into a freshly-allocated body that happens to be at the same address as the freedarr_ptr(because the freed block got picked up by the next alloc). The free-list link inbody[0]gets overwritten with1.body[0](i.e.node.next) is1. Sets$gc_free_head = 1.$node = 1, triesi32.load (node - 4)= read at0xFFFFFFFD. Trap.The exact same vulnerable shape lives in both files:
name_ptr/wrapper_ptr(inwrite_html) and intermediate value pointers stored intomap_dict(inwrite_json's JObject branch) have the same exposure.Empirical confirmation
Patching the host so memory is pre-grown by 256 pages (16 MB) before entering
write_htmlmakes thehtml_parsetrap disappear and returnsOkas expected:This proves the bug is GC-pressure-dependent: when no
allocduring the walk actually needs to grow memory,$gc_collectnever fires, no stale Python-held pointers get reclaimed, and the walk completes correctly. The same fix works forwrite_jsonby symmetry — same pattern, same workaround applies.Scope
This affects every host walker that allocates more than one WASM object across Python-held pointers:
vera/wasm/html_serde.py::write_html— confirmedvera/wasm/json_serde.py::write_json— confirmedFix options (sorted by effort)
Option 1 — pre-grow memory in host walkers (small, surgical). Have
host_html_parseandhost_json_parseestimate an upper bound on tree size from the input length, callcaller["memory"].grow(...)to ensure enough headroom, then walk normally. Estimate:2 * len(input) + 65536bytes is comfortably conservative for both formats. No WAT changes, no new exports. Cons: over-grows memory for small inputs; doesn't fix the architectural issue; still vulnerable if a future walker forgets to do this.Option 2 — add
$gc_disabledflag + host helpers (medium). Add a(global $gc_disabled (mut i32) (i32.const 0)), modify$allocto skipgc_collectwhen set (go straight tomemory.grow), and export setter/getter helpers. Host walkers set it before walking, clear after. Cons: requires WAT changes; clearing-on-exception is easy to forget.Option 3 — export
$gc_sp+$gc_stack_limit, push intermediate pointers from the host (clean, larger). Architecturally correct: host walkers push theirarr_ptr/name_ptr/wrapper_ptronto the shadow stack before the next alloc. GC sees them as live. Cons: requires WAT exports + host helpers + careful push/pop discipline in every walker; the abstraction needs to handle exception unwinding.Recommendation: ship Option 1 immediately to close both reported call sites. File Option 3 as a follow-up for the architectural fix that prevents the next host walker from inheriting the bug.
Test that pins the fix
Two
tests/conformance/entries:html_parseon a checked-in copy of the current FAQ body — assertsOk(or a well-formedErr).json_parseon a generated nested-JSON fixture sized to trigger memory growth — assertsOk.Either
Okor a well-formedErris contractually acceptable — a trap is not.Workaround for users today
html_parsecall works if the input was always going to be plain text anyway.string_slicesize cap the reporter considered would have caught their specific FAQ case but is unsound as a general fix because the bisection shows length alone isn't the trigger.Files
vera/wasm/html_serde.py::write_html— primary site (confirmed)vera/wasm/json_serde.py::write_json— same bug class, same fix needed (confirmed)vera/codegen/api.py::host_html_parse/host_json_parse— would gain the pre-grow call (Option 1)vera/codegen/assembly.py::_emit_alloc— would gain the$gc_disabledcheck (Option 2)Original report
The reporter's full write-up is reproduced below.
Full bug report (click to expand)
[report verbatim]