Skip to content

Indexing a captured Array<T> inside a closure body produces invalid WASM #588

@aallan

Description

@aallan

Summary

Indexing a captured Array<T> inside a closure body produces invalid WASM. The element type doesn't matter — Array<Int>, Array<Bool>, Array<Array<Bool>> all reproduce. The closure-body context is the trigger; the same indexing outside a closure works fine.

Two distinct failure modes depending on nesting depth:

  • Flat case (one level): WASM module fails validation — the binary is structurally invalid and wasmtime rejects it before the first instruction runs. Error: unknown table 0: table index out of bounds.
  • Nested case (two levels): module passes validation but traps at runtime. Error: wasm trap: undefined element: out of bounds table access or sometimes indirect call type mismatch (variant depends on what occupies the table slot the codegen lands on).

vera check accepts both shapes — the failure is purely in codegen.

Discovered via two independent paths:

  1. A Claude.ai sandbox writing Conway's Game of Life produced a bug report with the minimum reproducer and narrowing matrix.
  2. The conformance-test-writer agent hit it while writing the apply_fn(closure, ()) on (Unit -> X) closure trips WASM type mismatch (phantom i64 param in call_indirect sig) #586 apply_fn conformance test and worked around it by using array_length instead.

Reproducers

Flat case (validation failure)

private fn step(@Array<Int> -> @Array<Int>)
  requires(true) ensures(true) effects(pure)
{
  array_map(@Array<Int>.0, fn(@Int -> @Int) effects(pure) {
    @Array<Int>.0[0]
  })
}

public fn main(@Unit -> @Unit)
  requires(true) ensures(true) effects(<IO>)
{
  let @Array<Int> = step([10, 20, 30]);
  IO.print("done\n")
}

vera check → OK
vera run:

Error: Unhandled WASM trap: failed to compile: wasm[0]::function[3]::step
Caused by:
    1: Invalid input WebAssembly code at offset 1094:
       unknown table 0: table index out of bounds

Nested case (runtime trap)

private fn step(@Array<Array<Bool>> -> @Array<Array<Bool>>)
  requires(true) ensures(true) effects(pure)
{
  array_mapi(@Array<Array<Bool>>.0, fn(@Array<Bool>, @Nat -> @Array<Bool>) effects(pure) {
    array_mapi(@Array<Bool>.0, fn(@Bool, @Nat -> @Bool) effects(pure) {
      @Array<Array<Bool>>.0[@Nat.1][@Nat.0]
    })
  })
}
wasm trap: undefined element: out of bounds table access

Narrowing matrix

Shape Result
array_length(@Array<T>.0) on captured array works
Closure-local @Array<T> parameter, indexed works
Captured @Array<T>.0[i] (flat) fails validation
Captured @Array<Array<T>>.0[i][j] (nested) fails at runtime
let @Array<T> = @Array<T>.0; @Array<T>.0[i] (flat case) works (rebind workaround)
let @Array<T> = @Array<T>.0; ... let-bound[i] (nested case) still fails
Same indexing outside any closure works

The element type is not the trigger — both Array<Int> and Array<Bool> fail in the same shape. The trigger is the indexing operator on a slot reference whose binding is a capture, when that indexing happens inside a closure body.

Hypothesis

The [] operator is lowered to a typed-load operation whose function-table slot or load-op selection depends on the operand's resolved type. When the operand is a captured Array, the post-lifted operand has a slightly different shape than a "real" Array slot in the closure-local scope, and the load/dispatch path computes a wrong table index.

The flat-case validation-failure vs nested-case runtime-trap difference is plausibly the same mechanism landing at different distances off the correct table index — close enough to be in-bounds (runtime trap on a wrong-typed function) for the nested case, far enough to be out-of-bounds (validation failure) for the flat case.

Likely sites to inspect:

  • vera/wasm/data.py::_translate_index_expr — the index-expression lowering. Check whether the collection's type info comes from _infer_index_element_type_expr, which special-cases SlotRef with coll.type_name == "Array" (recently extended for Type aliases over Array<T> break WASM codegen (parameter case: invalid WAT; let-binding case: silent E602 skip) #583 to follow type aliases). Capture-derived SlotRefs may have a slightly different type_name shape.
  • vera/codegen/closures.py — the closure-lifting pass. Check whether captured-Array slot references are rewritten the same way for [] operands as they are for array_length / array_map operands. The latter work; the former doesn't, so they go through different lowering.

Workarounds

Flat case: let-rebind the captured array

array_map(@Array<Int>.0, fn(@Int -> @Int) effects(pure) {
  let @Array<Int> = @Array<Int>.0;
  @Array<Int>.0[0]
})

Nested case: pass the array through a top-level function

The let-rebind workaround does not suffice. But a top-level helper that takes the array as an explicit parameter does:

private fn read_cell(@Array<Array<Bool>>, @Nat, @Nat -> @Bool)
  requires(true) ensures(true) effects(pure)
{
  let @Array<Bool> = @Array<Array<Bool>>.0[@Nat.1];
  @Array<Bool>.0[@Nat.0]
}

private fn step(@Array<Array<Bool>> -> @Array<Array<Bool>>)
  requires(true) ensures(true) effects(pure)
{
  array_mapi(@Array<Array<Bool>>.0, fn(@Array<Bool>, @Nat -> @Array<Bool>) effects(pure) {
    array_mapi(@Array<Bool>.0, fn(@Bool, @Nat -> @Bool) effects(pure) {
      read_cell(@Array<Array<Bool>>.0, @Nat.1, @Nat.0)
    })
  })
}

In this minimal form it runs cleanly. However, scaling up to a full Conway's Game of Life implementation still produces the trap or — more often — silent corruption where invalid bytes leak into IO.print. The minimum repros do not reproduce that variant. Either there is an additional trigger not yet isolated or there are multiple closely-related codegen bugs.

SKILL.md correction needed

The "Capturing a pair-typed value" section currently claims:

Capturing a String or Array<T> works the same way as capturing any other outer binding — the closure-struct layout serialises both halves of the (ptr, len) pair, so array_length / string_length / element-access all see the captured value correctly inside the closure body.

The "element-access" claim does not hold when element-access is via the [] operator on a slot reference whose binding is a capture. It does hold via array_length and via let-rebound copies (in the flat case).

Acceptance

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions