Conversation
Add a typed AST layer (frozen dataclass nodes with source spans) and a Lark Transformer that converts parse trees into AST. This is the foundation for the type checker (Phase C3). - vera/ast.py: ~50 node classes covering all grammar constructs - vera/transform.py: ~86 transformer methods, bottom-up conversion - vera/cli.py: `vera ast` and `vera ast --json` commands - vera/parser.py: parse_to_ast() convenience function - vera/errors.py: TransformError for transformation failures - tests/test_ast.py: 83 tests (round-trip, node-specific, span, serialisation) - README.md, SKILLS.md, CHANGELOG.md: updated documentation 193 tests pass (110 parser + 83 AST). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
aallan
added a commit
that referenced
this pull request
Mar 31, 2026
Add a 'State handler with a loop helper' subsection showing the most common State pattern: handle[State<Int>] wrapping a where-block helper that has effects(<State<Int>>). This pattern was the #1 undocumented syntax that LLMs fail to produce. VeraBench testing shows: - Without this example: Claude Sonnet 4 fails check on both VB-T5-004 (State accumulator) and VB-T5-008 (IO print loop) — can't produce valid where-block + effect syntax - With this example: both problems pass check (100%) The example covers: pure outer function with handler, effectful where- block helper with decreases clause, pure helper called from handler body, with clause for state updates, and visibility rules for where blocks. Relates to: aallan/vera-bench#15 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 tasks
aallan
added a commit
that referenced
this pull request
Mar 31, 2026
* Add State handler + where block example to SKILL.md Add a 'State handler with a loop helper' subsection showing the most common State pattern: handle[State<Int>] wrapping a where-block helper that has effects(<State<Int>>). This pattern was the #1 undocumented syntax that LLMs fail to produce. VeraBench testing shows: - Without this example: Claude Sonnet 4 fails check on both VB-T5-004 (State accumulator) and VB-T5-008 (IO print loop) — can't produce valid where-block + effect syntax - With this example: both problems pass check (100%) The example covers: pure outer function with handler, effectful where- block helper with decreases clause, pure helper called from handler body, with clause for state updates, and visibility rules for where blocks. Relates to: aallan/vera-bench#15 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: update allowlist and site assets for SKILL.md State handler section The 51-line "State handler with a loop helper" insertion shifted all code-block line numbers after line 1213, making 19 allowlist entries in check_skill_examples.py stale. Update them to their new positions, remove entries whose blocks now parse cleanly, and add the one new entry needed (Correct: import syntax example at 1681). Also clarify the "Pure helper functions" key point: add_value is called from the `where` block helper (sum_loop), not from the handler body directly. Regenerate docs/ site assets to include the new SKILL.md section. Co-Authored-By: Claude <noreply@anthropic.invalid> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Claude <noreply@anthropic.invalid>
This was referenced Apr 16, 2026
aallan
added a commit
that referenced
this pull request
Apr 23, 2026
Both findings real, both fixed. Predicate hoist (#2): - Extracted the canonical ASCII whitespace predicate into a module-level helper emit_is_ascii_whitespace(byte_local, indent) in vera/wasm/helpers.py. Single-source-of-truth for the branchless {space(32) | (byte - 9) < 5} encoding of Python's str.isspace() ASCII set {tab(9), LF(10), VT(11), FF(12), CR(13), space(32)}. Replaced four open-coded copies with calls to the helper: 1. _translate_is_whitespace (~10 LOC → 1 line) 2. _is_ws_inline closure inside _translate_trim 3. _translate_structural_split count pass (string_words) 4. _translate_structural_split emit pass (string_words) - Helper docstring documents the historical context (PR #510 round 2 caught _translate_strip's divergent narrow predicate; round 4 hoisted the helper) and warns against re-encoding the byte set. - Direct application of the byte-literal-collision review rule (memory file feedback_review_heuristics.md): two functions sharing a predicate routed through one helper from day 1 prevents future divergence. Browser regression (#1): - New tests/test_browser.py::TestBrowserStringUtilities:: test_string_strip_vt_ff: compiles and runs a program that calls string_strip("\\u{0B}\\u{0C}hi \\u{0B}") and asserts the output is "hi". Pins the strip→trim delegation contract under the browser runtime — if a future refactor accidentally re-opens the old narrow predicate for strip, this assertion breaks. All 3,487 tests pass; mypy clean; doc count check + site asset check + spec/SKILL/conformance/example checks all green. Co-Authored-By: Claude <noreply@anthropic.invalid>
aallan
added a commit
that referenced
this pull request
Apr 23, 2026
Two new issues surfaced by the same agent run: - #515: $gc_collect itself faults with out-of-bounds memory access under sustained allocation pressure. 40×20 Conway's Game of Life over 200 generations reliably reproduces. The collector walks past $heap_ptr to the linear-memory bound and traps — gc_collect is the top frame of the crashing stack, not the program. This is distinct from #487 (alloc path) and #484 (sweeper 16-bit truncation). - #516: Runtime traps bubble up as raw wasmtime stack traces; the CLI mis-labels every Trap/WasmtimeError as "Runtime contract violation" even when the actual cause is out-of-bounds memory access, integer overflow, unreachable, etc. No source-line mapping, no actionable guidance — the exact opposite of the carefully crafted compile-time diagnostics. Three-stage proposed scope: categorise trap reason, source-map the Vera function that trapped, specialise help for common trap classes. KNOWN_ISSUES.md updated with rows for both. ROADMAP.md implementation-order table re-sorted: 1. #514 (nested closures + captured-scalar indirection) 2. #515 (GC collect faults) 3. #516 (runtime trap diagnostics) 4. #475 (WASM translator bug cleanup — previously #1) 5. #507 (Eq/Ord-dispatched array ops) The top three are all surfaced by the single Game of Life agent run — the empirical signal is the clearest prioritisation we've had. #475 was promoted to #1 after PR #511 merged; pushed to #4 as the new issues outrank it on agent-adoption impact. Agent-noted quote, worth capturing as the motivation for #516 specifically: "the gap between 'the type system is happy' and 'the compiled artefact actually runs' is wider than you'd expect from a language with SMT-verified contracts. The verifier can prove your termination argument is sound while the codegen silently miscompiles your closure environment out from under you." Also posted comment on #514 documenting the third shape the agent is now isolating (closure body with i32/i64 type mismatch, distinct from direct-nesting and captured-scalar-indirection). No code changes. Co-Authored-By: Claude <noreply@anthropic.invalid>
6 tasks
aallan
added a commit
that referenced
this pull request
Apr 23, 2026
CodeRabbit caught two copies of the same BROKEN/WORKING nested-closure example pair — one labeled (1) and one unlabeled but otherwise identical. Dropped the unlabeled duplicate; kept the (1)-labeled copy as the canonical example for the 'Known limitation: nested closures and captured-scalar indirection' subsection. scripts/check_skill_examples.py: cleaned up ALLOWLIST (fix_allowlists re-introduced duplicate keys per the memory'd bug; manually deduped and re-verified 66 unique entries, 0 duplicates). Co-Authored-By: Claude <noreply@anthropic.invalid>
aallan
added a commit
that referenced
this pull request
May 7, 2026
…nt-integration Reframed "What's next" to acknowledge the stabilisation work uncovered by the v0.0.119–v0.0.137 bug-killing campaign. Two patterns worth closing out before declaring runtime-correctness floor raised: 1. Scale-dependent bugs slip past the standard test suite (#515, #570, #487, now #593 — all required real-world programs at scale to surface; none would have been caught by the existing 3,747 small focused tests). 2. Walker-completeness gaps are silent failures of the same shape (#588 found _walk_free_vars missing 8 AST branches; we don't yet know if other walkers in the codebase have similar incompleteness). The "Implementation order" table is now split into two tiers: - Stabilisation tier (priority): #593, #596, #597, #595 - Agent-integration tier (resumes after): #222, #523, #370 #596 (stress-test harness) and #597 (walker-completeness audit) are new issues filed in this PR — they convert "I think we're stabilising" into evidence by building the infrastructure that prevents recurrence of the bug classes #588 and #593 surfaced. #593 retains the lead position because it's the largest open codegen unknown — bisection narrows it to 12x30 + recursive run_loop with allocating call argument, but the root cause hasn't been isolated. Until then, "WASM codegen is stable at scale" is unverified. #595 sits last in the stabilisation tier because it likely shares a root cause with #593 (both surface together at the same scale); fixing #593 may close it. The v0.0.137 KeyboardInterrupt guard already prevents the Python-traceback half. Order #5 (LSP) doesn't start until orders #1-#4 close (or are explicitly deferred with an open follow-up). Tier separator is explicit so future readers don't have to reverse-engineer the ordering rationale. Co-Authored-By: Claude <noreply@anthropic.invalid>
6 tasks
aallan
added a commit
that referenced
this pull request
May 7, 2026
1. HISTORY.md "137 tagged releases" -> 138 (line 315): the v0.0.138 row was added but the rollup footer count still said 137. 2. ROADMAP.md stabilisation-gate sentence: "order #5 starts when #1-#4 are closed" -> "order #4 starts when #1-#3 are closed". The earlier deletion of the #593 row (now closed) shifted the stabilisation tier to three items; the agent-integration tier numbering is also renumbered 5/6/7 -> 4/5/6 to match. 3. New test: test_eager_gc_array_mapi_with_non_allocating_string_closure mirrors the existing array_map String eager-GC test for the array_mapi unwind path with a fn(@Bool, @nat -> @string) closure returning pick(@Bool.0). The existing array_mapi ADT test (test_array_mapi_non_allocating_closure_emits_balanced_push) covered the i32 branch; this new test covers the i32-pair branch. 4. Tighten the structural #593 predicate. Pre-fix it accepted any "global.set $gc_sp" + absent "call $alloc"; CodeRabbit observed that a bare "global.set $gc_sp" could in principle be a stack- restore rather than a push. In practice the prologue's gc_sp_save/restore is gated on ctx.needs_alloc and the regex already excludes call $alloc, so a non-allocating body cannot contain a restore -- but the literal predicate was weaker than the intent. Adds _has_gc_shadow_push helper that anchors on the canonical "i32.const 4 / i32.add / global.set $gc_sp" SP-increment sequence emitted by gc_shadow_push (vera/wasm/helpers.py). Used by both the i32-pair and i32-ADT structural tests; a non-push artefact in the body can no longer satisfy the assertion. Plus: HISTORY.md tooling section gains a v0.0.138 row for VERA_EAGER_GC alongside the existing tooling milestones (cross- linked to ENVIRONMENT.md, which already has the full reference section for it). Suite: 3,747 passed (+1 from new array_mapi String test), 14 skipped. Co-Authored-By: Claude <noreply@anthropic.invalid>
3 tasks
aallan
added a commit
that referenced
this pull request
May 7, 2026
Two Conway's Life agent experiments on --target browser surfaced five concrete blockers (issues #602/#603/#604/#609/#610) and an explicit design memo (#608) mapping each obstacle to a concrete runtime-only fix. The current stabilisation framing covers codegen reliability and walker completeness but treats browser- target reliability as a separate concern. The agent's diagnosis makes that split harder to defend: "write once, run anywhere" is currently true for pure computation and approximate-to-false for anything with timing or screen output. Two of the fixes (#609 JSPI-driven IO.sleep, #610 ANSI subset interpreter) close the timing and rendering halves of the seam without language changes -- adding them to the stabilisation tier commits to "browser-target Vera is something you'd actually use" before the agent-integration push. Changes: - Expanded the campaign-pattern list from two patterns to three; third pattern documents the browser-target seam with links to the umbrella issue (#608) and the five concrete blockers. - Added #609 (JSPI-driven IO.sleep) as item 3 in the stabilisation tier, with rationale + the WebAssembly JSPI / Asyncify mechanism. - Added #610 (ANSI subset interpreter) as item 4 in the same tier, paired with #609 -- together they close the seam and let life.vera (terminal version) run unchanged on --target browser. - Renumbered #595 from item 3 to item 5 (now last in stabilisation, since it's contingent on an upstream wasmtime-py release). - Renumbered the agent-integration tier items 4/5/6 -> 6/7/8 to match. - Updated the "What moves when" gate from "#4 starts when #1-#3 are closed" to "#6 starts when #1-#5 are closed". Co-Authored-By: Claude <noreply@anthropic.invalid>
9 tasks
aallan
added a commit
that referenced
this pull request
May 8, 2026
…ents + CodeRabbit Four review agents (code-reviewer, pr-test-analyzer, comment-analyzer, silent-failure-hunter) plus three CodeRabbit comments produced a batch of cross-cutting findings. This commit addresses them all in one pass to avoid the iterative round-trip cost the bug class itself was filed to eliminate. **Critical fixes:** 1. **Closure body E615 harvest gap (silent-failure-hunter C1).** `_compile_lifted_closure` constructed its own WasmContext but never harvested `_interp_inference_failures` — closure bodies with E615-triggering interpolation segments were silently dropped, producing call_indirect→missing-table-entry WASM validation traps with no source-located diagnostic. Extracted the harvest into `CodeGenerator._harvest_interp_inference_failures` on the codegen base and called it from both functions.py and closures.py. Removed the now-disproved `# pragma: no cover — defensive` claim on closures.py's body-instrs-None path. New regression test `test_e615_in_closure_body_emits_diagnostic`. 2. **`_canonical_named_type` type_args from terminal NamedType (CodeRabbit #3 + code-reviewer I1).** The walker's `outer_type_args` capture rule (always read from first NamedType) lost type_args when alias_map substitution bound a generic param to a parameterised type. Empirically reachable today via `type Mapper<A, B> = fn(A -> B); Mapper<Int, Array<Int>>` etc. Fix: drop the "outermost" rule, return type_args from the terminal NamedType reached. All 21 existing #602-class regression tests still pass. New regression test `test_canonical_named_type_terminal_args_propagation`. 3. **HISTORY.md footer (CodeRabbit #1).** 141 tagged releases → 142 to match the v0.0.142 row added in PR #631. **UX fix:** 4. **Multiple E615 per function (silent-failure-hunter H2).** Pre-this-fix `_translate_interpolated_string` returned None on the first failing segment, requiring N round-trips for N bad segments in one interpolation. Now collects every failing segment via `had_failure` flag, then bails at the end. New regression test `test_multiple_e615_in_one_interpolation`. **Doc-quality fixes (comment-analyzer):** 5. **Trigger-count drift** in `operators.py:491-494` — "every canonicalisation gap (#602 and ten subsequent triggers)" implied 11; rewrote to "the ten triggers of the #602 bug class accumulated across PRs #627 + #629". 6. **`_format_named_type_canonical` docstring drift** — claim "matches the pre-#630 fallback shape" was empirically wrong (pre-#630 resolved through aliases, post-#630 fallback doesn't). Rewrote to acknowledge the deliberate behavioural change. 7. **"Future closure-arg shapes plug in here without further dispatch ladder"** — overstatement. Adding a `FnCall` returning a closure still needs an `elif`; what's saved is canonicalisation logic, not dispatch logic. `IfExpr` between closures has no single `return_type` field — example was misleading. Rewrote in both `_infer_fncall_vera_type` and `_infer_apply_fn_return_type`. 8. **`_canonical_wasm_type` docstring** — added explicit `Unit → None` documentation; clarified that `None` (no WASM type) and `"i64"` (unreachable-NamedType default) are distinct return shapes that callers must not conflate. **Tests added (besides the regressions tied to fixes):** 9. **`test_per_function_isolation_of_failures_list`** — pins that a clean function and a dirty function in the same source produce independent `_interp_inference_failures` lists. 10. **`test_e615_fires_on_result_in_interpolation`** — adjacent ADT shape (Result vs Option) so a future Option-handling broadening doesn't accidentally regress the parallel Result path. 11. **`test_array_map_refinement_returning_closure`** — pins the previously-unaudited `_infer_closure_return_vera_type` path in `calls_arrays.py`, which the #630 migration broadened to handle refinements. **Test enhancements:** 12. **E615 ordering before matching E602** in the existing test (CodeRabbit #2) — pinned via `warnings.index(e615[0])` < `warnings.index(matching_e602)`. Filtered to the E602 that mentions the offending function so prelude-skip E602s don't confound the assertion. 13. **E615 source-location attached** in the existing test — softened to "line > 0" (any location) because SlotRef-inside- InterpolatedString spans are unreliable today (#634, separate follow-up). Tightening to "line points at the segment" is the natural acceptance test for that follow-up. **Skipped with reasons:** - **Cycle-guard regression on `_resolve_base_type_name`** — test-analyzer C1 / code-reviewer S4. Verified pre-existing and dead-code-safe today (cycles crash an upstream resolver first). Filed as #633. - **`_canonical_wasm_type`'s `i64` default for unhandled apply_fn / call_indirect shapes** — silent-failure-hunter H1. Bigger refactor parallel to Tier 2 but for the WASM-side dispatch. Filed as #632. - **SlotRef-inside-InterpolatedString span fidelity** — surfaced when writing the E615 source-location assertion. Diagnostic-quality concern, separate from #630's canonicalisation scope. Filed as #634. - **`_canonical_named_type` worked-example docstring** (comment-analyzer I6) — superseded by fix #2 above (semantics changed; docstring rewritten with concrete examples). - **Per-context isolation prose-vs-test** (comment-analyzer I4) — addressed via fix #9 (test added). - **`_resolve_i32_pair_ret_te` 30-line upper docstring** — comment-analyzer style nit. The prose is load-bearing context (#628 cross-module narrative); trimming would lose history without obvious gain. Defer. **Follow-ups:** #632 (apply_fn / call_indirect E616), #633 (_resolve_base_type_name cycle guard), #634 (interpolation SlotRef source spans). Added to KNOWN_ISSUES.md alongside #628. Counts: 3,792 tests (3,778 passed, 14 skipped) — six new tests under `TestE615LoudInterpolationFallthrough630`. CHANGELOG + KNOWN_ISSUES + ROADMAP + TESTING.md + HISTORY footer all synced. mypy clean. Co-Authored-By: Claude <noreply@anthropic.invalid>
8 tasks
aallan
added a commit
that referenced
this pull request
May 11, 2026
CodeRabbit symmetry catch: the `CodegenInvariantError` catch-block in `_compile_fn` emits the `[E699]` warning without first calling `self._harvest_interp_inference_failures(ctx)`, whereas the parallel `CodegenSkip` catch-block does call it before the `[E602]` warning. Empirically invariant violations fire early — before any interpolation translation has populated `ctx._interp_inference_failures` — so this is mostly insurance. But keeping the two handlers structurally identical is the right principle: any time we add a new failure side-channel above either path, we want both to drain it. Added the harvest call plus a comment explaining the symmetry reasoning. Refs #626 #658 Co-Authored-By: Claude <noreply@anthropic.invalid>
aallan
added a commit
that referenced
this pull request
May 11, 2026
Five findings from CodeRabbit's review of #659 — three 🟠 Major correctness issues and two 🟡 Minor quality-of-life improvements. All five verified valid against current code and addressed minimally. CR-1 (`scripts/check_e602_clean.py:174`) — 🟠 `_extract_skips()` filtered only `("E602", "E604")` but the CHANGELOG entry and the suppression pass in `vera/codegen/core.py` both treat `[E605]` as part of the same template-warning surface. A future E605 silent skip would have slipped past the gate. Added `"E605"` to the filter tuple + inline comment cross-referencing the suppression pass. CR-2 (`tests/test_codegen.py:15880`) — 🟡 `test_mono_suffix_correct_for_slotref_fn_alias_arg` asserted WAT suffix strings but didn't execute the compiled module — a regression that produced correct WAT but broken indirect-call wiring would slip past. Added `execute(result, fn_name="main")` + `assert exec_result.value == 20` to pin the runtime contract. Also added a sibling test `test_parameterised_alias_substitutes_type_args` exercising the CR-4/CR-5 fix with the same execute+assert shape (21 for `Some(7) * 3`). CR-3 (`vera/codegen/core.py:552`) — 🟠 Pre-fix the template-warning suppression checked `_fn_sigs` for the existence of mono clones. But `_fn_sigs` is populated in Pass 1.5 (registration) *before* any mono body is compiled — so a clone that `_compile_fn(mdecl)` later rejects with `None` would still be counted as "compiled" and wrongly suppress its template's diagnostic. That hides the only pre-runtime signal for a genuinely-broken generic. Introduced `compiled_mono_bases: set[str]` populated only inside the `if fn_wat is not None:` branch of the mono-clone compile loop. Suppression now uses `compiled_mono_bases & forall_decl_names`. CR-4 (`vera/codegen/monomorphize.py::_resolve_arg_fn_shape`) — 🟠 For a *parameterised* FnType alias like `type Mapper<T> = fn(T -> T)`, `@Mapper<Int>.0` resolved to the raw alias body `fn(T -> T)` without substituting `T → Int`. Downstream `_infer_fn_alias_type_args` then bound alias-local names (`A → T`, `B → T`) instead of concrete ones, producing mono suffixes like `option_map$T_T` rather than `option_map$Int_Int`. Fixed by substituting the SlotRef's `type_args` into the alias body before returning. CR-5 (`vera/wasm/calls.py::_resolve_arg_fn_shape_wasm`) — 🟠 Same gap on the WASM call-site rewriting side. Same fix applied. Shared substitution helper Rather than duplicate the type-var substitution logic in two codegen modules, extended the existing module-level `substitute_type_vars` in `vera/wasm/inference.py` (used by `InferenceMixin` canonicaliser + `CodeGenerator._type_expr_to_wasm_type`) to also handle `FnType` (it previously handled only `NamedType` and `RefinementType`). Both `_resolve_arg_fn_shape` helpers now import + use the canonical implementation — same semantics across all four callers, no drift risk. Validation - mypy: clean (59 source files) - pytest: 3,794 passed, 14 skipped (was 3,793 + 1 new test) - e602 gate: 116 files clean, 6 allowlist entries matched, 0 stale - All 86 conformance programs pass; all 34 examples pass `vera check` + `vera verify` Refs #604 #655 #659 Co-Authored-By: Claude <noreply@anthropic.invalid>
aallan
added a commit
that referenced
this pull request
May 13, 2026
CR-1 (`CONTRIBUTING.md:90`) — commit-stage hook count drift. The heading "**commit-stage** hooks (24, each gated to relevant files) include:" disagreed with the immediately-preceding sentence "25 are configured at the commit stage". Updated to match (24 -> 25). Pure doc-consistency fix. CR-2 (`scripts/check_walker_coverage.py:167`) — auto-discover walker files instead of hardcoded list. The hardcoded `WALKER_FILES` list silently skipped any new walker file added without updating the list — replicating the exact silent-skip class this script was written to close. Verified by inspection: nothing forced contributors adding a new walker file to also touch this list. Replaced with `_discover_walker_files()` which globs `vera/**/*.py` for files containing the `WALKER_COVERAGE:` marker. Returns a sorted `list[Path]` (same shape as the hardcoded list it replaces) so downstream code referencing `WALKER_FILES` is unchanged. Validation - `python scripts/check_walker_coverage.py` -> 9 walkers cover all 29 Expr subclasses (clean — same output as pre-fix) - Discovered file list matches the pre-fix hardcoded list exactly (6 files, same paths) This also addresses the #1 critical finding from the pr-review-toolkit audit run before this commit. Refs #597 #668 Co-Authored-By: Claude <noreply@anthropic.invalid>
This was referenced May 13, 2026
aallan
added a commit
that referenced
this pull request
May 13, 2026
**Finding 1: substring-only WAT checks could false-pass** (valid).
The previous tests asserted constants like `0x80000000`, `i32.or`,
`0x7FFFFFFF`, and `i32.ge_u` in isolation. But:
- `i32.const 0x80000000` appears in BOTH the wrap-site tag AND
the heap-ceiling guard. Stripping the wrap OR would still
leave the substring (from the guard) and the test would pass.
- `i32.or` also appears in header-mark manipulation, unrelated
to the wrap path.
Switched all three tests to boundary-safe adjacent-sequence
regex:
- `test_wrap_emits_tag_or`: `re.search(r"i32\.const 0x80000000\s+i32\.or", ...)`
- `test_unwrap_emits_mask_and`: `re.search(r"i32\.load\s+offset=4\s+i32\.const 0x7FFFFFFF\s+i32\.and", ...)`
- `test_alloc_emits_heap_ceiling_guard`: extract `$alloc` via
`re.search(r"\(func \$alloc\b", ...)` (boundary-safe vs the
previous `find("(func $alloc")` which could false-match
`$alloc_xxx`), then assert the EXACT ordered 8-instruction
guard sequence (`global.get $heap_ptr; local.get $total;
i32.add; i32.const 0x80000000; i32.ge_u; if; unreachable;
end`) via a single multi-line regex with `re.DOTALL`.
**Finding 2: _wrap_handle doesn't check raw_handle < 0x80000000**
(valid). If a handle ever has bit 31 set, `raw_handle |
0x80000000` is a no-op and the unwrap mask `& 0x7FFFFFFF`
returns the WRONG handle — silent host-store-lookup corruption
that's exactly the latent failure class #578 itself sought to
eliminate. Practically infeasible (counters bounded well below
2^31; a 2B-handle session is wall-clock-infeasible) but the
invariant deserves an explicit assertion per DESIGN.md principle
#1 (checkability). Added `if raw_handle & 0x80000000: raise
RuntimeError(...)` immediately before the `_write_i32` /
`_call_register_wrapper` block in
`vera/codegen/api.py::_wrap_handle`. Error includes
raw_handle, kind, body_ptr per CR's exact recommendation.
**Not addressed: CR's 3rd finding**
("Missing the long-run high-handle-counter regression that
#578 is meant to prevent" — labelled Major, Heavy lift).
Declined with reasoning:
1. **#578 was a RETENTION bug, not a correctness bug.** Pre-fix
the conservative scan could mark unrelated heap objects when
a handle value matched the heap-range check. In an
IMMUTABLE language those objects can't corrupt state — only
memory grows. No output assertion can distinguish pre-fix
from post-fix at modest scale.
2. **A true regression test requires heap-growth inspection.**
Vera's test infra doesn't expose `$heap_ptr` post-run; adding
that machinery is a separate infrastructure change. An
alternative — measure WASM `memory.size` deltas — needs the
same instrumentation.
3. **The structural invariants are mathematically sufficient.**
The 3 tightened tests + `$alloc` heap-ceiling guard prove
by construction that no tagged value can EVER match the
heap-range check: heap_ptr is hard-capped at 0x80000000, and
every tagged value is >= 0x80000000. Sets don't overlap.
This is a stronger guarantee than a stochastic
non-occurrence stress test could provide.
4. **A weaker "scale" test would be misleading.** 200K wrap
allocations under eager GC complete fine pre-fix too (the
retention is bounded and self-clearing once the falsely-
retained object becomes legitimately unreachable in a later
cycle). Such a test would pass on broken code, providing
false confidence.
If we add `$heap_ptr` inspection to the test harness in future
(a worthwhile general improvement), I'd be happy to file a
follow-up that includes a deterministic retention regression.
Tracked as a possible follow-up issue.
Validation: pytest 3868/3868, mypy clean, ruff S clean,
doc-counts consistent.
Co-Authored-By: Claude <noreply@anthropic.invalid>
9 tasks
aallan
added a commit
that referenced
this pull request
May 22, 2026
1. host_md_parse exception scope (#693 inline #1): the broad except Exception was swallowing host-side invariant violations (shadow-stack overflow from _ShadowGuard, ValueError from write_md_block exhaustive match, internal AssertionErrors). Restructured to put only the parse step inside the try; shadow-stack work + write_md_block + Result.Ok alloc now sit outside, so invariant violations propagate as wasmtime traps rather than being repackaged as user-domain parse errors. Mirrors the narrow-catch pattern in host_html_parse and host_json_parse. 2. Redundant guard.push in markdown.py walkers (#693 inline #2): the _write_inline_array / _write_block_array / _write_array_of_block_arrays / _write_table_data helpers already push their backing buffer onto the shadow stack internally. Removed the duplicate guard.push(arr_ptr) calls immediately after these helpers in write_md_inline (MdEmph, MdStrong, MdLink) and write_md_block (MdParagraph, MdHeading, MdBlockQuote, MdList, MdTable, MdDocument). Functionally equivalent (the entry IS rooted, just twice) but halves shadow-stack consumption from those branches and removes the only correct way to overflow the guard on pathologically deep walks. Pushes after alloc_string calls are KEPT - alloc_string does not push. 3. TESTING.md "435 tests" -> "440 tests" (outside-diff): the pytest-parametrized shell-block comment now matches the table entry and the new 88-program x 5-stage total. 4. Tracking: added #694 (Windows test_browser timeout flake) to KNOWN_ISSUES.md "Test coverage gaps" with the symptom, asymmetry-test diagnostic that proves it is a flake (3.11 and 3.13 pass on the same Windows runner where 3.12 timed out), and the one-line fix (bump timeout=30 to timeout=60 at the two sites). Per the PR-generated-new-issues-must-be- tracked-in-the-same-PR rule. Validation: 3,815 pytest pass + 88 conformance + mypy clean + all doc validators green + limitations sync consistent. Co-Authored-By: Claude <noreply@anthropic.invalid>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
TransformErrorfor unhandled rulesvera ast <file>(indented text) andvera ast --json <file>(JSON output)Test plan
vera ast examples/factorial.veraprints readable indented textvera ast --json examples/factorial.veraproduces valid JSONto_dict()andpretty()work on all node types🤖 Generated with Claude Code