Skip to content

Add array utility built-ins (sort, reverse, contains, find, any/all) #466

@aallan

Description

@aallan

Summary

Vera's array built-ins cover the basics (length, append, concat, slice, map, filter, fold) but are missing several high-frequency operations that every program needs. Notably, there is no way to sort an array, and no way to map with index — a gap that forces hand-written recursive accumulators with multi-parameter, same-typed slots (the #1 De Bruijn failure pattern for LLMs per VeraBench).

Proposed API

Indexed variants

  • array_mapi<A, B>(@Array<A>, @Fn(@A, @Nat -> @B) effects(pure)) -> @Array<B> — map with zero-based index. The callback receives the element and its position. Follows array_map's signature with one extra @Nat parameter; matches OCaml List.mapi, Rust iter().enumerate().map(), JavaScript arr.map((x, i) => ...), Python enumerate. Collapses the recursive-accumulator-with-index pattern (the one that caused Type-checked pure program produces incorrect runtime output that isolated probes cannot reproduce #464 during Conway's Game of Life development) into a single call.

Transform functions

  • array_reverse<T>(@Array<T>) -> @Array<T> — reverse element order
  • array_sort<T>(@Array<T>) -> @Array<T> where Ord<T> — sort using the Ord ability
  • array_sort_by<T>(@Array<T>, @Fn(@T, @T -> @Ordering)) -> @Array<T> — sort with custom comparator
  • array_zip<A, B>(@Array<A>, @Array<B>) -> @Array<(A, B)> — pair elements (truncates to shorter) — deferred until tuple types exist
  • array_flatten<T>(@Array<Array<T>>) -> @Array<T> — flatten one level of nesting

Query functions

  • array_contains<T>(@Array<T>, @T) -> @Bool where Eq<T> — membership test
  • array_index_of<T>(@Array<T>, @T) -> @Option<Nat> where Eq<T> — find position
  • array_find<T>(@Array<T>, @Fn(@T -> @Bool)) -> @Option<T> — first match
  • array_any<T>(@Array<T>, @Fn(@T -> @Bool)) -> @Bool — existential predicate
  • array_all<T>(@Array<T>, @Fn(@T -> @Bool)) -> @Bool — universal predicate

Implementation

  • environment.py: Register each function with appropriate type signatures and ability constraints.
  • codegen/api.py: Host imports for sort (Python sorted()), reverse (list(reversed(...))), and the remaining ops.
  • wasm/calls.py: Wire calls to host imports.
  • Prefer iterative WASM / host imports over prelude recursion. The existing array_map / array_filter / array_fold in prelude.py are recursive Vera functions, so a 450-element input allocates 450 shadow stack frames. The new Type-checked pure program produces incorrect runtime output that isolated probes cannot reproduce #464 overflow guard catches this at ~1,365 frames, but a nested array_mapi inside another recursive op can still hit the ceiling. Implementing the new ops (and ideally retrofitting the existing ones) as iterative WASM loops or host imports avoids the class of shadow-stack concerns entirely — tracked as a follow-up refactor.
  • Verification: array_sort / array_reverse / array_mapi postcondition: result has the same length as input. array_contains / array_any / array_all are boolean — standard Z3 encoding. array_index_of / array_find return Option<T> — ensure None when no match.

Priority

Highest-impact additions, in order:

  1. array_sort — every non-trivial program needs sorting; currently requires a hand-written merge sort with decreases reasoning.
  2. array_mapi — collapses the recursive-accumulator-with-index pattern that's the dominant LLM failure mode. Concrete evidence: the Conway's Game of Life program that triggered Type-checked pure program produces incorrect runtime output that isolated probes cannot reproduce #464 would have been ~20 lines shorter and would not have hit the GC bug if array_mapi existed.
  3. array_contains — membership currently requires array_length(array_filter(...)) > 0.
  4. array_reverse — same recursive-accumulator pattern as array_mapi.

The remaining transform and query functions (array_sort_by, array_flatten, array_index_of, array_find, array_any, array_all) are useful but lower-frequency.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions