Skip to content

Emit iterative WASM for array_fold — closes #480#489

Merged
aallan merged 3 commits into
mainfrom
iterative-array-fold
Apr 17, 2026
Merged

Emit iterative WASM for array_fold — closes #480#489
aallan merged 3 commits into
mainfrom
iterative-array-fold

Conversation

@aallan

@aallan aallan commented Apr 17, 2026

Copy link
Copy Markdown
Owner

Summary

Last of three. Replaces the recursive prelude array_fold / array_fold_go with a single iterative WAT loop running closure (env, U, T) -> U over each element and updating a running accumulator.

What's different from PRs 1 and 2:

  • No output allocation. Fold returns scalar U, not Array<U> — no dst, no write_idx.
  • Two-param closure signature. The acc + elem both come into the call site; pair-typed U doubles the parameter count.
  • In-place shadow-stack root for pair/ADT accumulators. The running pointer must stay rooted across closure-invocation GC cycles. Rather than push/pop each iteration, we push once up front and overwrite the slot via global.get $gc_sp; i32.const 8; i32.sub; local.get acc_ptr; i32.store. Keeps gc_sp stable, keeps exactly one root alive.

Prelude cleanup bonus: decouples _ARRAY_TYPE_ALIASES injection from the (now empty) array_fn_names set. The coupling CodeRabbit flagged on #485 became the default once all three combinators migrated; now any user code referencing ArrayMapFn / ArrayFilterFn / ArrayFoldFn still works.

Closes #480.

Regression tests (5 new)

  • test_array_fold_large_input_no_stack_overflow — 10K elements, sum 0..9999 = 49,995,000
  • test_array_fold_empty_input — closure never invoked, acc stays at init
  • test_array_fold_pair_accumulator_gc_safetyString concat fold, exercises the in-place shadow-slot overwrite
  • test_array_fold_closure_captures_outer_variable — free-var lift through a two-param closure
  • test_array_fold_of_array_map — nested, exercises _infer_concat_elem_type on array_map (same path PR 2 fixed for filter-of-map)

Test plan

  • mypy vera/ clean (57 files)
  • pytest tests/ -q — 3,327 passed, 11 skipped
  • 34 TestArrayOperations tests pass (29 from before + 5 new fold regressions; existing 4 array_fold basic tests now exercising the iterative path)
  • Pre-commit hooks green

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Performance Improvements

    • Array combinators (map, filter, fold) now use optimised iterative implementations reducing memory overhead to constant-space in typical cases.
  • Behaviour/GC

    • Fold keeps accumulator rooted efficiently to improve stability for complex accumulator types.
  • Documentation

    • Roadmap, changelog and testing docs updated to reflect increased test counts and design notes.
  • Tests

    • Five new fold-focused tests added; total tests now 3,338.

Last of three (#480 PR 3).  array_fold<T, U>(arr, init, fn) is now
emitted as a single WAT loop running closure (env, U, T) -> U over
each element.  Structurally different from PRs 1 and 2:

- No output allocation.  Fold returns a scalar U, not Array<U>, so
  just a running accumulator — no dst, no write_idx.
- Closure sig has two value params, not one.
- For pair-typed or ADT accumulators (e.g. String fold-concat,
  Map-building fold), the running pointer must stay rooted across
  closure-invocation GC cycles.  Solution: shadow-push acc_ptr
  once before the loop, then OVERWRITE the slot in-place each
  iteration with the new pointer from the closure result.  WAT:

    global.get $gc_sp
    i32.const 8
    i32.sub
    local.get $acc_ptr
    i32.store

  Keeps gc_sp stable across iterations (no per-iteration push/pop
  churn) and keeps exactly one acc root on the shadow stack.
  Scalar accumulators (Int, Nat, Float64, Bool, Byte) skip the
  rooting entirely.

Prelude cleanup: removed array_fold / array_fold_go bodies and
emptied array_fn_names.  Decoupled _ARRAY_TYPE_ALIASES injection
from array_fn_names (CodeRabbit's original concern from PR 485 now
applies — with fn-names empty, aliases were never injected under
the old coupling).  User code that references ArrayMapFn /
ArrayFilterFn / ArrayFoldFn in type annotations still works.

Five new regression tests in TestArrayOperations:

- test_array_fold_large_input_no_stack_overflow — 10K elements,
  sum 0..9999 = 49,995,000
- test_array_fold_empty_input — closure never invoked, acc stays
  at init
- test_array_fold_pair_accumulator_gc_safety — String concat fold,
  exercises the in-place shadow-slot overwrite path
- test_array_fold_closure_captures_outer_variable — free-var
  lift through to two-param closure
- test_array_fold_of_array_map — nested; _infer_concat_elem_type
  sees an array_map call (same path the PR 2 review fixed for
  filter-of-map)

Co-Authored-By: Claude <noreply@anthropic.invalid>
@coderabbitai

coderabbitai Bot commented Apr 17, 2026

Copy link
Copy Markdown

Caution

Review failed

Pull request was closed or merged during review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f2936757-d7da-4567-b4d7-5e8a8a6a6e4c

📥 Commits

Reviewing files that changed from the base of the PR and between e9b4b15 and b221c00.

📒 Files selected for processing (4)
  • ROADMAP.md
  • TESTING.md
  • tests/test_prelude.py
  • vera/prelude.py

📝 Walkthrough

Walkthrough

This PR migrates array_fold (alongside previously-implemented array_map and array_filter) from recursive Vera prelude functions to iterative WASM loops, achieving O(1) shadow-stack usage. Implementation includes closure dispatch via call_indirect, dynamic closure signature registration for scalar and pair-typed accumulators, and per-iteration GC rooting via shadow-stack slot overwrites.

Changes

Cohort / File(s) Summary
Documentation
CHANGELOG.md, ROADMAP.md, TESTING.md
Bumped reported test counts (3,333 → 3,338) and documented iterative WASM codegen for array_map, array_filter, and array_fold, noting O(1) shadow-stack behaviour and closure-rooting strategy.
Prelude removal / injection
vera/prelude.py
Removed prelude bodies for array_fold (and ensured array_map/array_filter remain codegen-emitted); switched array_fn_names to an empty set so type aliases (_ARRAY_TYPE_ALIASES) still inject when needed while combinator bodies are no-ops.
WASM code generation
vera/wasm/calls_arrays.py
Added _infer_fold_init_vera_type() and _translate_array_fold() implementing iterative array_fold<T,U> with single WAT loop, dynamic closure signature registration for scalar/pair-typed accumulators, per-iteration call_indirect, and in-place accumulator rooting via shadow-slot overwrite.
Call dispatch & cross-module handling
vera/wasm/calls.py, vera/codegen/modules.py
Added array_fold dispatch in _translate_call() and expanded the known built-in combinators set to include array_fold so cross-module scanning treats it as handled.
Tests
tests/test_codegen_monomorphize.py, tests/test_prelude.py
Added five array_fold-focused tests (large input 10k, empty input, pair/ADT GC safety via @String, closure capture, nested array_fold(array_map(...), ...)); updated prelude tests to assert no combinator bodies are injected and _ARRAY_FN_NAMES is empty.
Small module glue
other small edits
Minor comment/docstring updates and bookkeeping lines across files to reflect the new implementation and test counts.

Sequence Diagram(s)

sequenceDiagram
    participant Caller as WASM Loop
    participant Env as Closure Environment
    participant Closure as User Closure
    participant ShadowStack as Shadow Stack (GC Root)
    
    Note over Caller,ShadowStack: array_fold<T,U>(arr, init, fn) → U
    
    Caller->>Caller: init_local ← init_arg
    Caller->>ShadowStack: push(arr_ptr) [root input]
    Caller->>ShadowStack: push(fn_handle) [root function]
    
    loop For each element in arr
        Caller->>Caller: load arr[idx] → locals
        Caller->>Env: pass (env, acc, elem)
        Caller->>Closure: call_indirect fn(env, acc, elem)
        Closure->>Closure: compute new accumulator
        Closure-->>Caller: return new_acc
        
        alt acc is pair-typed or ADT handle
            Caller->>ShadowStack: overwrite shadow[+2] ← new_acc_ptr
            Note over ShadowStack: single slot reused per iteration
        else acc is scalar
            Note over Caller: no per-iteration push/pop
        end
        
        Caller->>Caller: acc_local ← new_acc
    end
    
    Caller->>ShadowStack: pop(fn_handle, arr_ptr)
    Caller-->>Caller: return acc [scalar or (ptr, len)]
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related issues

  • #490: The PR's adjustments to array_fold GC-rooting heuristics and accumulator rooting closely match the issue's objective to tighten is_gc_managed checks and avoid over-rooting host-managed handles.

Possibly related PRs

  • #485: Iterative array_map implementation — shares the same iterative-WASM pattern and prelude/injection changes applied here.
  • #486: Iterative array_filter implementation — parallel changes to prelude injection, dispatch and codegen for array combinators.
  • #454: Closure ABI / i32_pair param-return fixes — directly relevant to call_indirect signature expansion and pair-typed accumulator handling in this PR.

Suggested labels

tests, compiler, docs

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and concisely describes the primary change: emitting iterative WASM for array_fold, which aligns with the main objective of replacing recursive prelude implementations with direct WASM loops.
Linked Issues check ✅ Passed The PR addresses issue #480 by implementing iterative WASM loops for array_fold, maintaining O(1) shadow-stack space, supporting GC rooting via in-place slot overwrite, and adding comprehensive regression tests covering large inputs, empty inputs, pair accumulators, closure capture, and nested operations.
Out of Scope Changes check ✅ Passed All changes are directly scoped to issue #480: iterative array_fold codegen, prelude decoupling (type aliases vs. bodies), cross-module call handling, test coverage, and documentation updates reflecting the structural shift. One follow-up item (tighter ADT-rooting heuristic) was appropriately deferred to issue #490.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch iterative-array-fold

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov

codecov Bot commented Apr 17, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 88.57143% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.27%. Comparing base (a9ed615) to head (b221c00).
⚠️ Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
vera/wasm/calls_arrays.py 87.96% 16 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #489      +/-   ##
==========================================
- Coverage   90.40%   90.27%   -0.14%     
==========================================
  Files          58       58              
  Lines       19845    19980     +135     
  Branches      225      225              
==========================================
+ Hits        17941    18037      +96     
- Misses       1900     1939      +39     
  Partials        4        4              
Flag Coverage Δ
javascript 50.67% <ø> (ø)
python 95.07% <88.57%> (-0.19%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_prelude.py`:
- Line 129: Remove the vacuous assertion that "_ARRAY_FN_NAMES.issubset(names)"
in tests/test_prelude.py: delete the assertion line (or replace it with a
meaningful check) since _ARRAY_FN_NAMES is now empty and the explicit
forbidden-name checks already cover the regression; ensure no other tests rely
on this assertion and run the test suite to confirm nothing else needs updating.

In `@vera/prelude.py`:
- Around line 523-535: Update the inject_prelude docstring to reflect the new
array-combinator behaviour: state that when _ARRAY_COMBINATORS is empty the
prelude will only inject type aliases (via _ARRAY_TYPE_ALIASES) for
ArrayMapFn/ArrayFilterFn/ArrayFoldFn and will not inject the combinator function
bodies, and note that function bodies will only be injected when
_ARRAY_COMBINATORS is populated and array_fn_names is not a subset of
user_names; remove or change any phrasing that claims array_fold is recursively
injected so the docstring matches the runtime checks around array_fn_names and
_ARRAY_COMBINATORS in inject_prelude.

In `@vera/wasm/calls_arrays.py`:
- Around line 1064-1070: The current code rejects folds when the accumulator is
Unit because _element_wasm_type("Unit") returns None; update the branch around
_infer_fold_init_vera_type / _is_pair_element_type / _element_wasm_type so that
if u_type is the Unit type you do NOT return None but instead treat the
accumulator as erased: remove the accumulator parameter/result from the
generated call_indirect signature and skip creating locals/root bookkeeping for
U (i.e. avoid allocating or tracking the U local and omit it from pair handling
in _is_pair_element_type paths), while keeping the rest of the fold wiring
intact.
- Around line 1072-1082: The current logic in the accumulator rooting heuristic
(variables u_is_adt and u_needs_root) treats any non-Bool/Byte i32 as a GC
pointer; change it to only root when the type is positively known to be a
GC-managed heap type instead of relying on u_wasm == "i32". Replace the u_is_adt
condition with a call to a predicate that detects GC-managed heap types (e.g.
is_gc_managed(u_type) or a similar helper based on _element_wasm_type) and then
set u_needs_root = u_is_pair or <that_gc_managed_predicate>; ensure array_fold
and any uses of u_needs_root only root when the predicate returns true, so
opaque host handles (Map/Set/Json/Decimal) that also lower to i32 are not
shadow-pushed.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0ce7a0a8-e9ff-43ca-a895-a14832cf17a5

📥 Commits

Reviewing files that changed from the base of the PR and between a9ed615 and e9b4b15.

📒 Files selected for processing (9)
  • CHANGELOG.md
  • ROADMAP.md
  • TESTING.md
  • tests/test_codegen_monomorphize.py
  • tests/test_prelude.py
  • vera/codegen/modules.py
  • vera/prelude.py
  • vera/wasm/calls.py
  • vera/wasm/calls_arrays.py

Comment thread tests/test_prelude.py Outdated
Comment thread vera/prelude.py
Comment thread vera/wasm/calls_arrays.py
Comment thread vera/wasm/calls_arrays.py
aallan and others added 2 commits April 17, 2026 18:20
Two small fixes; two findings declined with reasoning.

Fixed:
- test_prelude.py: removed the `_ARRAY_FN_NAMES.issubset(names)`
  assertion that was always trivially true (set is empty); the
  explicit forbidden-name loop already covers the regression.
- prelude.py: updated `inject_prelude`'s docstring to reflect the
  new array behavior — aliases always injected (decoupled), bodies
  only when `_ARRAY_COMBINATORS` is non-empty and fn names aren't
  shadowed.

Declined:
- Unit-accumulator support in `_translate_array_fold`: verified
  not reachable by current grammar.  Vera has no Unit literal
  syntax (`unit` rejected as E005; `()` produces an unrelated WAT
  compilation error).  No test or example can trigger the path.
  When first-class Unit values land (if ever), this is easy to
  add.
- Tighter ADT-rooting heuristic: the current code over-roots
  host-managed handles (Map, Set, Decimal, Regex — small host-side
  ints, not Vera-heap pointers).  This is safe over-caution — the
  conservative GC's alignment/range check rejects out-of-range
  values, so rooting an int like 5 doesn't mark anything spurious.
  Under-rooting would be a silent-GC bug (same class as #464), so
  the current bias is the correct safety default.  A cleaner
  type-category predicate distinguishing GC-managed from host-
  managed types is a small but cross-cutting refactor worth its
  own PR — filed as #490.

Co-Authored-By: Claude <noreply@anthropic.invalid>
Follow-up from #489 review. Files the over-rooting of host-managed
handles in iterative combinators under Continuous / Compiler
internals — it's a small hygiene refactor, not a feature milestone,
and not a bug so it doesn't belong in KNOWN_ISSUES.

Co-Authored-By: Claude <noreply@anthropic.invalid>
@aallan aallan merged commit 62abd51 into main Apr 17, 2026
18 checks passed
@aallan aallan deleted the iterative-array-fold branch April 17, 2026 17:29
aallan added a commit that referenced this pull request Apr 17, 2026
Three new operations on the built-in IO effect:

- IO.sleep(@nat) -> Unit            # pause for N milliseconds
- IO.time(@Unit) -> @nat            # current Unix time in ms
- IO.stderr(@string) -> Unit        # write to stderr

Implementation trail:

- environment.py: three OpInfo entries on the IO EffectInfo.
- codegen/api.py: host_sleep (time.sleep), host_time (time.time *
  1000, i64), host_stderr (writes to sys.stderr or optional capture
  buffer).
- ExecuteResult gains a `stderr: str` field (empty by default to
  preserve the pre-#463 shape).  execute() gains a `capture_stderr:
  bool = False` parameter so tests can opt in without leaking to the
  real stderr.
- codegen/assembly.py: three _IO_IMPORTS entries (sleep param i64,
  time result i64, stderr param i32 i32).
- wasm/context.py: _is_void_expr now treats IO.sleep and IO.stderr
  as void like IO.print, so trailing statements don't emit a
  spurious `drop`.
- browser/runtime.mjs: hostSleep busy-waits on performance.now
  (Atomics.wait unavailable on main thread; long sleeps block
  rendering), hostTime = Date.now() as BigInt, hostStderr captures
  to stderrBuf with getStderr()/clearStderr() accessors.  Reset
  path includes stderrBuf.
- browser/harness.mjs: emits `stderr` alongside `stdout` in the JSON
  output so test_browser.py can assert on captured stderr.

Tests:

- tests/conformance/ch07_io_time_stderr.vera — exercises time +
  stderr together; level=run.  (Sleep excluded — wall-clock timing
  is inherently flaky; covered by unit tests instead.)
- tests/test_codegen.py: 5 new tests (time returns positive Nat,
  sleep completes, sleep(0) no-op, stderr captured when requested,
  stderr empty by default).
- tests/test_browser.py: 3 new tests (stderr capture, time past
  epoch, sleep completes).

Documentation:

- SKILL.md: IO table updated from 7 ops to 10; browser-runtime
  paragraph extended to cover the new ops.
- spec/12-runtime.md: three new host-import rows, three new
  §12.4.1.x subsections (sleep/time/stderr), browser comparison
  table extended.
- CHANGELOG.md: v0.0.114 section with the full write-up.
- Doc-count sweep: 73 → 74 conformance, 3,338 → 3,351 tests,
  test_browser/test_codegen/test_codegen_monomorphize line counts,
  test count references in README/FAQ/AGENTS/TESTING/ROADMAP.

Also bundled in this PR (ROADMAP → HISTORY shuffle for #480, which
should have been in #489 but was missed):

- HISTORY.md gains three Stage 11 entries: #484 widen header, #480
  iterative combinators, and the v0.0.114 #463 entry.
- ROADMAP.md removes #480 row from "What's next", removes the #463
  standalone bullet from Phase 4a, adjusts the reshape paragraph
  and capability-expansion examples list to match.

Version bump: v0.0.113 → v0.0.114 in pyproject.toml,
vera/__init__.py, docs/index.html, README.md.

Co-Authored-By: Claude <noreply@anthropic.invalid>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Reimplement higher-order array ops as iterative WASM loops

1 participant