v0.0.137: fix #588 (captured-Array indexing in closure produces invalid WASM)#594
Conversation
…invalid WASM)
Indexing a *captured* `Array<T>` inside a closure body — `coll[i]`
where `coll` is a SlotRef to an outer-scope binding — produced
invalid WASM. Element type didn't matter: Array<Int>, Array<Bool>,
Array<Array<Bool>> all reproduced. Two failure modes by depth:
- Flat case: WASM module fails *validation* with "unknown table 0:
table index out of bounds" (binary structurally invalid).
- Nested case (two levels): module passes validation but traps at
runtime with "undefined element: out of bounds table access" or
"indirect call type mismatch".
Root cause: `_walk_free_vars` in `vera/wasm/closures.py` had no
`IndexExpr` branch. When the walker hit `coll[idx]` inside a closure
body, the `coll` SlotRef referencing a captured outer slot was never
recognised as a free variable. The closure-lift's `captures` list
came back empty, body translation failed (the SlotRef couldn't be
resolved against an empty capture-only env), and
`_compile_lifted_closure` returned None — but the call site at
`_translate_apply_fn` had already emitted a `call_indirect` to the
now-absent function-table entry.
Fix adds the missing `IndexExpr` branch plus seven other AST node
types that were silently falling through with the same bug class:
- `ArrayLit` (elements may contain captured slots)
- `InterpolatedString` (Expr parts may reference captures)
- `HandleExpr` (with proper handler-clause param scoping and
handler-state @t scoping)
- `AssertExpr` / `AssumeExpr` (their inner predicate)
- `ForallExpr` / `ExistsExpr` (domain expr + predicate AnonFn)
- `ModuleCall` (cross-module call args)
Each silently dropped capture references inside its sub-expressions
the same way IndexExpr did. The fix is mechanical for all eight
branches: walk the sub-expressions with appropriate scope-tracking.
Conformance test ch05_capture_array_index.vera covers the three
documented positions: flat (the minimal IndexExpr-in-walker
reproducer), nested (two-level array indexing through captured
outer), combined (closure body uses captured array as both
IndexExpr and FnCall arg).
Conformance count: 85 -> 86. Total tests 3,740 -> 3,745.
Note: the issue body acknowledged that scaling to a full Conway's
Game of Life implementation may have additional triggers beyond
captured-array indexing. Both documented BUG_REPORT.md reproducers
(repro_min.vera, repro_nested.vera) pass post-fix. The remaining
full-Life corruption (string-output corruption from generation 1+
at 12x30 scale with recursive run_loop) is a separate not-yet-
isolated bug class tracked as #593, masked from the user as a
Python traceback by the v0.0.136 errors="replace" defensive layer
(surfaces as U+FFFD chars in output rather than crashing).
Doc updates:
- CHANGELOG entry under [0.0.137] Fixed
- HISTORY.md Stage 11 row for v0.0.137
- KNOWN_ISSUES.md: drop #588 row (closed); add #593 row (residual)
- All count surfaces (TESTING.md, ROADMAP.md, etc.) bumped
- Version files + uv.lock bumped 0.0.136 -> 0.0.137
Co-Authored-By: Claude <noreply@anthropic.invalid>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #594 +/- ##
==========================================
- Coverage 90.97% 90.87% -0.10%
==========================================
Files 59 59
Lines 22929 22968 +39
Branches 259 259
==========================================
+ Hits 20860 20873 +13
- Misses 2062 2088 +26
Partials 7 7
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
📝 WalkthroughWalkthroughVersion 0.0.137 extends closure free-variable traversal to additional expression forms (fixing captured Array indexing inside closures), converts Ctrl‑C during host IO.sleep into an internal _VeraExit(130) with tests, adds a conformance test, and updates release/docs/version/metrics (conformance-suite size 85 → 86). ChangesCaptured Array Indexing Closure Fix
Host Ctrl‑C / Runtime Exit Fix
Documentation and Metric Updates
Sequence Diagram(s)sequenceDiagram
participant Caller as IO.sleep caller
participant Time as time.sleep
participant Vera as host_sleep/_VeraExit
Caller->>Time: time.sleep(ms / 1000.0)
Time-->>Caller: KeyboardInterrupt (Ctrl‑C)
Caller->>Vera: raise _VeraExit(130) from None
Vera-->>Caller: handled as Vera exit result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
…ass) Discovered when the user Ctrl-C'd a Conway's Life animation that uses IO.sleep(120) between frames: KeyboardInterrupt arrived inside time.sleep, propagated up through wasmtime's trampoline as a "python exception" cause, and the user saw a multi-line Python traceback ending in KeyboardInterrupt — same WasmTrapError contract violation class as #589's UnicodeDecodeError escape (#516 / #522 / #547 — runtime traps must surface as Vera-native errors). Fix: host_sleep catches KeyboardInterrupt and raises _VeraExit(130) (conventional SIGINT exit code, 128 + signal-2). The _VeraExit chain unwrap at top of execute() recognises it and returns ExecuteResult with exit_code=130 plus any captured stdout/stderr — clean process exit, no Python traceback. Filed as #595: a follow-on macOS malloc abort can still fire during wasmtime/ctypes cleanup after the clean exit. That's a separate wasmtime-interaction bug, not data-integrity, and fixing it is outside our control unless we can identify what state the program was in at the Ctrl-C moment. Possibly shares a root cause with #593 (Life corruption from generation 1+) since both surface together. Bundling this into PR #594 because: - Same release target (v0.0.137 host-runtime hygiene) - Same class as #589's defensive WasmTrapError-contract work - Tiny scope (5-line guard + 1 test class) - Discovered while testing #594's #588 fix Tests added (TestHostSleepKeyboardInterrupt in test_runtime_traps.py): - Structural assertion: host_sleep wraps time.sleep in try/except KeyboardInterrupt with _VeraExit(130) raise - Behavioural test: synthetic KeyboardInterrupt → _VeraExit(130) conversion verified end-to-end with unittest.mock.patch KNOWN_ISSUES.md: adds #595 row for the residual macOS malloc abort. CHANGELOG entry expanded under [0.0.137] with both fixes. Test counts: 3,745 → 3,747 (+2 tests in test_runtime_traps.py). Co-Authored-By: Claude <noreply@anthropic.invalid>
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@HISTORY.md`:
- Line 314: HISTORY.md is inconsistent: the release line shows "55 active
development days" while the header still shows "54"; update the header count
(the occurrence near the top that currently reads "54 active development days")
to "55 active development days" so both the header and the release line (the
string "Total: **810+ commits, 137 tagged releases, 55 active development
days.**") match.
In `@TESTING.md`:
- Line 11: Locate the table row text "| **Conformance programs** | 86 programs
across 9 spec chapters, validating every language feature |" and all other
numeric mentions of the conformance suite (the occurrences of "425" and "85"
elsewhere in TESTING.md), then determine the authoritative counts from the
actual conformance test suite (e.g., run the tally script or inspect the test
manifest) and make the document internally consistent by replacing those numbers
with the canonical values; ensure the same canonical counts are used in the
table row, the summary sentences, and any other places that mention conformance
tests/programs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: d68f0290-382c-4e97-944d-3151f862666d
⛔ Files ignored due to path filters (8)
docs/SKILL.mdis excluded by!docs/**docs/index.htmlis excluded by!docs/**docs/index.mdis excluded by!docs/**docs/llms-full.txtis excluded by!docs/**docs/llms.txtis excluded by!docs/**docs/sitemap.xmlis excluded by!docs/**tests/conformance/ch05_capture_array_index.verais excluded by!**/*.verauv.lockis excluded by!**/*.lock,!uv.lock
📒 Files selected for processing (14)
AGENTS.mdCHANGELOG.mdCLAUDE.mdFAQ.mdHISTORY.mdKNOWN_ISSUES.mdREADME.mdROADMAP.mdSKILL.mdTESTING.mdpyproject.tomltests/conformance/manifest.jsonvera/__init__.pyvera/wasm/closures.py
Two findings, both partially valid: 1. HISTORY.md line 3 (header) said "54 active development days" while line 314 (footer) said "55". v0.0.137 lands on 7 May, which is 55 days from the 22 Feb start — header now reads 55, matching the footer. 2. TESTING.md had stale "85 conformance" mentions on lines 352 and 394 (script-validation tables describing check_conformance.py). Updated both to "86 conformance" matching the post-#588 count. Skipped from the rabbit's flagged candidates: - Line 187 "425 tests" — actually correct (check_doc_counts verified 86 conformance, 425 test_conformance.py runs; not every conformance program runs at all 5 stages so 86 != 425/5). - Lines 234, 445 "85%" — those are wasm/inference.py coverage percentages, not conformance counts. Different number, valid as-is. check_doc_counts.py confirms consistent at 3747 tests, 86 conformance, 33 examples post-fix. Co-Authored-By: Claude <noreply@anthropic.invalid>
…nt-integration Reframed "What's next" to acknowledge the stabilisation work uncovered by the v0.0.119–v0.0.137 bug-killing campaign. Two patterns worth closing out before declaring runtime-correctness floor raised: 1. Scale-dependent bugs slip past the standard test suite (#515, #570, #487, now #593 — all required real-world programs at scale to surface; none would have been caught by the existing 3,747 small focused tests). 2. Walker-completeness gaps are silent failures of the same shape (#588 found _walk_free_vars missing 8 AST branches; we don't yet know if other walkers in the codebase have similar incompleteness). The "Implementation order" table is now split into two tiers: - Stabilisation tier (priority): #593, #596, #597, #595 - Agent-integration tier (resumes after): #222, #523, #370 #596 (stress-test harness) and #597 (walker-completeness audit) are new issues filed in this PR — they convert "I think we're stabilising" into evidence by building the infrastructure that prevents recurrence of the bug classes #588 and #593 surfaced. #593 retains the lead position because it's the largest open codegen unknown — bisection narrows it to 12x30 + recursive run_loop with allocating call argument, but the root cause hasn't been isolated. Until then, "WASM codegen is stable at scale" is unverified. #595 sits last in the stabilisation tier because it likely shares a root cause with #593 (both surface together at the same scale); fixing #593 may close it. The v0.0.137 KeyboardInterrupt guard already prevents the Python-traceback half. Order #5 (LSP) doesn't start until orders #1-#4 close (or are explicitly deferred with an open follow-up). Tier separator is explicit so future readers don't have to reverse-engineer the ordering rationale. Co-Authored-By: Claude <noreply@anthropic.invalid>
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@TESTING.md`:
- Line 82: The coverage summary in test_runtime_traps.py described in the long
paragraph needs to include the new IO.sleep Ctrl-C suite: add a sentence or
clause mentioning the TestHostSleepKeyboardInterrupt test and its PR id (`#595`)
(and that it covers IO.sleep Ctrl-C / KeyboardInterrupt behavior) into the
existing TESTING.md summary paragraph for `test_runtime_traps.py` so the
narrative matches the increased test count and references the new suite by name
`TestHostSleepKeyboardInterrupt`, the feature `IO.sleep`, and PR `#595`.
In `@tests/test_runtime_traps.py`:
- Around line 2060-2100: Replace the local helper test (host_sleep_under_test)
with an end-to-end test that runs the real runtime path (call the same
entrypoint used in production such as cmd_run/execute or an IO.sleep invocation
via the codegen/runtime layer) instead of mimicking sleep logic; monkeypatch
time.sleep to raise KeyboardInterrupt, run the runtime with a program that calls
IO.sleep(...), and assert the process surfaces _VeraExit with code 130 (and that
no raw KeyboardInterrupt escapes). Use the real symbols IO.sleep and the
production entrypoint (cmd_run or execute) to exercise the full
import/trampoline path and keep checking for _VeraExit(130) as the observable
outcome. Ensure you remove host_sleep_under_test and adjust imports to use the
actual runtime entrypoint and _VeraExit from vera.codegen.api.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8a7c1098-e718-4309-9034-3e80b1e42bfc
📒 Files selected for processing (7)
CHANGELOG.mdHISTORY.mdKNOWN_ISSUES.mdROADMAP.mdTESTING.mdtests/test_runtime_traps.pyvera/codegen/api.py
…ep test Both findings valid; fixed both. 1. TESTING.md row 82 was missing the new TestHostSleepKeyboardInterrupt suite from its coverage prose, even though the test count and line count were already bumped. Appended a clause matching the existing per-suite style: names the suite, links #595 (the related residual issue this guards the in-our-control half of), summarises both tests in the class. 2. test_host_sleep_keyboard_interrupt_propagates_as_vera_exit was a "test-the-test" — it defined a LOCAL host_sleep_under_test mirror of the production code, patched time.sleep, and asserted the conversion. If the production code regressed (KeyboardInterrupt guard removed), this test still passed because it was checking the LOCAL copy. Replaced with test_host_sleep_keyboard_interrupt_end_to_end that: - Compiles a real Vera program calling IO.sleep(120) - Monkey-patches time.sleep to raise KeyboardInterrupt - Runs via execute() (the real production entrypoint) - Asserts ExecuteResult.exit_code == 130 - Asserts pre-sleep IO.print preserved (#522 contract) - Asserts post-sleep IO.print did NOT run Negative-test verified: temporarily reverting the production guard makes this test fail (wasmtime crashes when KeyboardInterrupt escapes the trampoline). Restoring the guard makes it pass again. So the test now genuinely pins the production path. Same "test-the-test" gap rabbit caught on PR #590's #589 e2e test. Fixing this one symmetrically — for #589 the gap was less tractable (would have required exporting wasmtime linker setup); here it was just a 30-line rewrite using execute() directly. test_runtime_traps.py grows from 2,100 to 2,138 lines (the e2e is slightly longer than the local-helper form). Test count unchanged (still 62: structural + e2e is the same shape as before). Co-Authored-By: Claude <noreply@anthropic.invalid>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@TESTING.md`:
- Line 76: Update the mismatched test count for the parametrized conformance
suite so both references match: change the inline comment that currently says
"425 tests" to "430 tests" (the same value shown in the table for
`test_conformance.py` and the parametrized conformance suite) so the comment and
table are synchronized for the `test_conformance.py` parametrized conformance
suite.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 416b3772-e007-494c-b622-4d2de70c8b25
📒 Files selected for processing (2)
TESTING.mdtests/test_runtime_traps.py
CodeRabbit on #598 caught that line 190's "Via pytest (parametrized -- 425 tests)" disagrees with the table at line 79 (which correctly says 430) and with pytest --collect-only (which reports 430 tests collected for tests/test_conformance.py). This is the same fix CodeRabbit flagged on PR #594 just before that PR merged. It was applied via the Edit tool at the time but landed in a local stash rather than the branch (the merge happened before the push) so the fix was effectively lost. Stash dropped as part of this commit. scripts/check_doc_counts.py confirms documentation counts are consistent post-fix; the canonical 430 = 86 conformance programs * 5 check phases (parse/check/verify/run/format-idempotency). Co-Authored-By: Claude <noreply@anthropic.invalid>
Summary
Closes #588.
Indexing a captured
Array<T>inside a closure body —coll[i]wherecollis a SlotRef to an outer-scope binding — produced invalid WASM. Element type didn't matter (Array<Int>,Array<Bool>,Array<Array<Bool>>all reproduced). Two failure modes by depth:unknown table 0: table index out of bounds(binary structurally invalid, wasmtime rejects before any instruction runs).undefined element: out of bounds table accessorindirect call type mismatch.vera checkaccepted both shapes — the failure was purely in codegen.Discovered via two independent paths:
BUG_REPORT.mdwith minimum reproducers and narrowing matrix.conformance-test-writeragent hit it while writing the apply_fn(closure, ()) on (Unit -> X) closure trips WASM type mismatch (phantom i64 param in call_indirect sig) #586 apply_fn test and worked around it by usingarray_lengthinstead.Root cause
_walk_free_varsinvera/wasm/closures.pyhad noIndexExprbranch.When the walker hit
coll[idx]inside a closure body, thecollSlotRef referencing a captured outer slot was never recognised as a free variable. The closure-lift'scaptureslist came back empty, body translation failed (the SlotRef couldn't be resolved against an empty capture-only env), and_compile_lifted_closurereturnedNone— but the call site at_translate_apply_fnhad already emitted acall_indirectto the now-absent function-table entry. Result: the module either failed validation (flat case, table empty) or trapped at runtime (nested case, dispatch landed in-bounds on a wrong-typed function).The asymmetry — call site emitted before lift attempt — meant a silent translation failure produced an unsigned-validation WAT.
Fix
Adds the missing
IndexExprbranch plus seven other AST node types that were silently falling through with the same bug class:IndexExprexpr.collection,expr.indexArrayLitInterpolatedStringExprpart (skippingstrparts)HandleExprbody, optionalstate.init_expr, eachclause.bodyandclause.state_update(with proper handler-clause param + handler-state @t scoping)AssertExpr/AssumeExprexprForallExpr/ExistsExprdomain+predicate(AnonFn — handled by existing branch)ModuleCallEach missing branch silently dropped capture references inside its sub-expressions. The fix is mechanical for all eight: walk the sub-expressions with appropriate scope-tracking.
Test plan
BUG_REPORT.mddocumented reproducers pass post-fix:repro_min.vera(flat case,Array<Int>)repro_nested.vera(nested case,Array<Array<Bool>>)repro_with_let.vera(workaround validation)tests/conformance/ch05_capture_array_index.veracovers three positions:array_mapwith captured@Array<Int>indexed in closure)array_mapi-of-array_mapiwith two-level array indexing through captured outer)IndexExprandFnCallarg)pytest tests/— 3,731 passed, 14 skipped (no new failures).mypy vera/clean.Residual issue (out of scope, filed separately as #593)
The bug report acknowledged that the full Conway's Life implementation at 12×30 scale may have additional triggers beyond captured-array indexing. Confirmed: the BUG_REPORT.md
repro_min/repro_nestedreproducers pass post-fix, AND a full Life subset at 3×3 / 5×5 / 12×30 (one step) all run cleanly. However, the full Life with recursiverun_loopover 200 generations at 12×30 still corrupts string output from generation 1+ (visible as U+FFFD characters interleaved with valid block characters, the v0.0.136errors="replace"defensive layer prevents this from escaping as a Python traceback).Bisection narrows the trigger to: 12×30 scale + recursive
run_loopwith allocatingnext_grid(...)argument across the recursive call boundary. Smaller scales work cleanly. Filed as #593 with hypothesis (GC reclamation of in-flight strings duringstring_join/string_concat); tracked inKNOWN_ISSUES.md.Doc updates
KNOWN_ISSUES.md— drops Indexing a captured Array<T> inside a closure body produces invalid WASM #588 row; adds Conway's Life at 12x30 still corrupts strings from gen 1 onwards (additional trigger beyond #588) #593 row for the residual.CHANGELOG.md— new[0.0.137]entry underFixedfor Indexing a captured Array<T> inside a closure body produces invalid WASM #588.HISTORY.md— Stage 11 row for v0.0.137.vera/__init__.py,pyproject.toml,README.md,docs/index.html) bumped 0.0.136 → 0.0.137.TESTING.md,CLAUDE.md,SKILL.md,AGENTS.md,FAQ.md,ROADMAP.mdupdated 85 → 86 / 3,740 → 3,745.uv.lockregenerated for the version bump.🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Documentation