Decompose vera/wasm/calls.py into 8 subsystem mixins (#418)#474
Conversation
Scaffolding for splitting the 8,390-line calls.py into 8 domain mixins. This commit only adds empty mixin files and wires them into WasmContext via multiple inheritance — no methods moved yet. Subsequent commits move methods one domain at a time. Co-Authored-By: Claude <noreply@anthropic.invalid>
Moves 18 methods to the new CallsMathMixin: - Math: abs, min, max, floor, ceil, round, sqrt, pow - Numeric conversions: to_float, float_to_int, nat_to_int, int_to_nat, byte_to_int, int_to_byte - Float predicates: is_nan, is_infinite, nan, infinity calls.py: 8,390 -> 7,948 lines (-442) calls_math.py: 0 -> 457 lines Also adds calls_math and the other new calls_* modules to the mypy mixin override list in pyproject.toml. Co-Authored-By: Claude <noreply@anthropic.invalid>
Moves 17 methods to the new CallsMarkupMixin: - JSON: json_parse, json_stringify - HTML: html_parse, html_to_string, html_query, html_text - Markdown: md_parse, md_render, md_has_heading, md_has_code_block, md_extract_code_blocks - Regex: regex_match, regex_find, regex_find_all, regex_replace - Async: async, await (identity ops) calls.py: 7,948 -> 7,652 lines (-296) calls_markup.py: 19 -> 315 lines Co-Authored-By: Claude <noreply@anthropic.invalid>
Moves 6 methods to the new CallsHandlersMixin: - Show ability: _translate_show + _SHOW_DISPATCH class attr - Hash ability: _translate_hash, _translate_hash_string - Effect handlers: _translate_handle_expr, _translate_handle_state, _translate_handle_exn calls.py: 7,652 -> 7,292 lines (-360) calls_handlers.py: 14 -> 379 lines Co-Authored-By: Claude <noreply@anthropic.invalid>
Moves 5 methods to the new CallsArraysMixin: - array_length, array_append, array_range, array_concat, array_slice _infer_concat_elem_type stays in calls.py core — it's called by both arrays and (future) string split/join methods. calls.py: 7,292 -> 6,734 lines (-558) calls_arrays.py: 19 -> 576 lines Co-Authored-By: Claude <noreply@anthropic.invalid>
Moves 27 methods to the new CallsContainersMixin (all three opaque-handle types using the host-import pattern): - Decimal: 9 methods (register_decimal_import + 8 translators) - Map: 11 methods + 5 helpers (map_wasm_tag, map_wasm_types, split_map_type_args, infer_map_key/value_from_map_arg, etc.) - Set: 7 methods + 2 helpers calls.py: 6,734 -> 6,124 lines (-610) calls_containers.py: 16 -> 627 lines Co-Authored-By: Claude <noreply@anthropic.invalid>
Moves 4 methods to the new CallsParsingMixin: - parse_nat, parse_int, parse_bool, parse_float64 All emit state-machine loops in WAT that allocate Result<T, String> on the heap with interned error messages from string_pool. calls.py: 6,124 -> 5,168 lines (-956) calls_parsing.py: 15 -> 971 lines Co-Authored-By: Claude <noreply@anthropic.invalid>
Moves 6 methods to the new CallsEncodingMixin: - Base64: base64_encode, base64_decode - URL: url_encode, url_decode, url_parse, url_join All emit heap-allocating state machines in WAT and return Result/UrlParts ADTs. calls.py: 5,168 -> 3,135 lines (-2,033) calls_encoding.py: 14 -> 2,048 lines Co-Authored-By: Claude <noreply@anthropic.invalid>
Moves 20 methods to the new CallsStringsMixin: - Basic: string_length, string_concat, string_slice, string_char_code, string_from_char_code, string_repeat - Search: string_contains, string_starts_with, string_ends_with, string_index_of - Transform: string_upper, string_lower, string_replace, string_split, string_join, string_strip - To-string conversions: to_string (Int), bool_to_string, byte_to_string, float_to_string calls.py: 3,135 -> 565 lines (-2,570) calls_strings.py: 23 -> 2,588 lines calls.py is now the target size — just the dispatcher, type inference helpers, and _infer_concat_elem_type. Co-Authored-By: Claude <noreply@anthropic.invalid>
- Remove now-unused helper imports (_element_mem_size, _element_store_op, _is_pair_element_type, gc_shadow_push) from calls.py — they moved to the domain mixins. - Update CallsMixin class docstring to document the new role (dispatch + shared helpers) and list sibling mixins. - Remove three stale section comments that lost their neighbouring dispatch branches during the split. ## Final structure | File | Lines | Purpose | |------|-------|---------| | calls.py | 572 | Main dispatcher + type inference helpers | | calls_math.py | 457 | abs/min/max/sqrt/pow/numeric conversions | | calls_markup.py | 315 | JSON/HTML/Markdown/Regex/async | | calls_arrays.py | 576 | array_length/append/range/concat/slice | | calls_handlers.py | 379 | Show/Hash/handle_state/handle_exn | | calls_containers.py | 627 | Map/Set/Decimal (opaque-handle types) | | calls_parsing.py | 970 | parse_nat/int/bool/float64 | | calls_encoding.py | 2,047 | base64/url | | calls_strings.py | 2,588 | All string ops + to_string conversions | calls.py: 8,390 -> 572 lines (-93%) Full test suite passes: 3,242 tests, 73 conformance, 30 examples. Co-Authored-By: Claude <noreply@anthropic.invalid>
📝 WalkthroughWalkthroughThis PR decomposes WebAssembly call translation for the Vera compiler by introducing eight new mixin classes— Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Rationale: Eight new files introducing ~6500 lines of heterogeneous low-level WASM instruction generation across distinct builtin categories. Each mixin follows a consistent translator pattern (argument evaluation, conditional failure, WAT emission, allocation/GC tracking), but the logic density is high—correctness of memory layouts, byte-copy operations, ADT struct offsets, heap allocation order, GC shadowing paths, and state-machine control flow requires careful verification. The modularity reduces coupling but increases the breadth of review across diverse domains (array semantics, Decimal arithmetic, URL/base64 standards, exception control flow, etc.). Possibly related issues
Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #474 +/- ##
==========================================
+ Coverage 90.35% 90.37% +0.01%
==========================================
Files 50 58 +8
Lines 19548 19588 +40
Branches 225 225
==========================================
+ Hits 17663 17703 +40
Misses 1881 1881
Partials 4 4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 10
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@vera/wasm/calls_arrays.py`:
- Around line 467-475: The slice bounds are being narrowed with i32.wrap_i64
before clamping, allowing large positive i64 indices to wrap to negative i32
values; change the code that builds start_instrs/end_instrs so you perform the
bounds checks and clamp values while still i64 (compare to 0 and to the array
length as i64, compute min/max to clamp to [0,len] in i64), and only after
clamping emit i32.wrap_i64 and local.set {s} / local.set {e}; apply the same fix
for the other occurrence around the block handling lines 477-504 so all bounds
are clamped in i64 prior to narrowing to i32.
In `@vera/wasm/calls_containers.py`:
- Around line 136-160: The current fallback in _map_wasm_tag routes every
non-primitive/non-String type to the opaque-handle tag "b", but Array<T>
actually lowers to an i32_pair and must not be folded into 'b' (this breaks
map_insert/map_values for Map<K, Array<T>>). Update _map_wasm_tag to detect
Array types (e.g. type strings like "Array<...>") and explicitly reject them by
raising an error (TypeError or ValueError) with a clear message that Map values
may only be primitives or String for now; leave String -> "s" and primitives ->
"i"/"f" unchanged. Also ensure _map_wasm_types remains consistent (the "s" tag
returns ["i32","i32"]; the opaque 'b' still returns ["i32"]) so imported names
and translate_expr expectations stay aligned.
In `@vera/wasm/calls_encoding.py`:
- Around line 1426-1440: url_parse currently discards the local has_auth and
delimiter presence info that url_join needs to reconstruct URLs exactly; modify
url_parse to preserve authority and delimiter state (e.g., add a boolean flag
like scheme_has_authority derived from has_auth and separate flags for
query_delimiter_present and fragment_delimiter_present) instead of only
computing auth_start from colon_pos/has_auth, and populate those flags wherever
has_auth/colon_pos/auth_start are used (including the blocks around has_auth,
colon_pos, auth_start). Then update url_join to consult scheme_has_authority
when emitting the scheme separator (emit "://" only when true, otherwise ":"),
and use the query_delimiter_present and fragment_delimiter_present flags to
decide whether to emit "?" or "#" for empty vs absent query/fragment, ensuring
round-trip preservation; apply the same changes consistently where similar logic
appears.
- Around line 391-423: The decoding currently only counts trailing '=' into
local pad but still treats '=' as value 0 in the main decode loop; change the
decoder so any '=' found outside the canonical trailing padding positions causes
an error: after computing pad (locals pad, ptr, slen) ensure the main loop only
processes bytes at indices < slen - pad, and add an explicit check in the main
loop that if the loaded byte equals 61 ('=') and the current index >= (slen -
pad) allow it only as part of trailing padding, otherwise return Err; update
references to local variables pad, ptr, slen and the main loop that maps input
bytes to values to enforce this validation (also apply the same fix for the
similar block around lines 553-561).
In `@vera/wasm/calls_handlers.py`:
- Around line 302-307: The catch-arm result type inference only checks
expr.clauses[0].body when it's an ast.Block and ignores expression-bodied
handlers, which leaves result_wt None and omits the (result ...) in generated
WAT; update the logic so that when clause_body exists but is not an ast.Block
you still infer its result type (call the appropriate expression inference
helper or infer from the expression node instead of only using
_infer_block_result_type), assign that to result_wt (falling back to inferring
from expr.body if still None), and apply the same change to the other occurrence
with the same pattern (the block around the second instance at 347-352) so
non-block handler arms produce the correct (result ...) in the emitted
block/try_table.
In `@vera/wasm/calls_parsing.py`:
- Around line 80-90: The current loop that skips space (byte 32) uses local.get
{byte} and always branches back (br $lp_pn), which allows embedded spaces to be
ignored; update the parse_nat and parse_int generation to only skip spaces while
parsing has not started by introducing or using a "started" local (or check that
index/local {idx} equals its initial value) and change the space-handling block
so it only branches (br $lp_pn) when started==0 (i.e., before any digit/sign has
been consumed); ensure the same change is applied to the other occurrence around
lines 301-311 so embedded spaces like "12 34" or "- 5" are rejected after
parsing starts.
In `@vera/wasm/calls_strings.py`:
- Around line 435-447: The to_string routine incorrectly negates INT64_MIN
causing overflow and emitting no digits; update the code in the to_string path
(around the negation/negative-check where locals is_neg and val are used) to
detect the INT64_MIN constant (-9223372036854775808) and handle it specially by
either (a) writing the known string/digits for INT64_MIN, or (b) converting to
an unsigned magnitude before digit extraction (cast val to u64 then take
magnitude without signed negation) so digit loop sees the correct positive
magnitude; adjust the branches around the i64.lt_s/negation block (the code that
sets is_neg/local.set {val}) to implement this INT64_MIN branch or
unsigned-magnitude conversion and ensure the rest of the digit-emission logic
consumes the unsigned magnitude.
- Around line 811-823: The fractional rounding can produce 1_000_000 and drop
the carry (so 1.9999995 becomes 1.0); after computing the rounded fractional
value stored in frac_val, detect if it equals 1_000_000 and if so set frac_val
to 0 and increment the integer part (the fval floor/local holding the integer)
accordingly; update the instruction sequence around the lines that compute and
set frac_val (references: local.get {fval}, i64.trunc_f64_s, local.set
{frac_val}) to perform this check and adjust the integer part before emitting
the six-digit loop so the carry is preserved.
- Around line 181-201: Before calling $alloc or doing i32.wrap_i64, validate the
slice bounds using the original i64 values and len_s: ensure start and end are
non-negative, cap them to len_s (or return/reject if out-of-range), and ensure
end >= start; only after clamping/validation perform i32.wrap_i64, compute
new_len (end - start), then call $alloc and gc_shadow_push(dst). Update the
block that sets locals start, end, new_len and the subsequent allocation (the
locals named start, end, new_len, dst and the call $alloc / gc_shadow_push
calls) to do this pre-check/clamp and to bail out or normalize lengths before
allocation and copying.
- Around line 250-253: The code uses idx (allocated local "i32") converted from
the Python index and then performs i32.load8_u without bounds checks; update the
char_code implementation to validate idx is within [0, len_s) and non-negative
before the load: after computing/setting idx and before the i32.load8_u call,
insert checks comparing idx against len_s and branch to trap or return a defined
error/value if out of range (use the existing ptr_s, len_s, idx locals and the
function's failure/trap path). Apply the same range-check pattern to the
analogous block referenced at lines 262-273 so all loads from the string buffer
validate 0 <= idx < len_s before calling i32.load8_u.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 464aad6a-1112-4324-ae5d-646578a3ae88
📒 Files selected for processing (11)
pyproject.tomlvera/wasm/calls.pyvera/wasm/calls_arrays.pyvera/wasm/calls_containers.pyvera/wasm/calls_encoding.pyvera/wasm/calls_handlers.pyvera/wasm/calls_markup.pyvera/wasm/calls_math.pyvera/wasm/calls_parsing.pyvera/wasm/calls_strings.pyvera/wasm/context.py
|
The 168 lines of missing patch coverage are pre-existing uncovered lines in the original To verify: the overall compiler coverage gate (80%) is met, and the overall coverage percentage is unchanged by this PR (still 96%). mypy is clean, full test suite (3,242) passes, all 73 conformance programs and 30 examples pass. Improving coverage on these paths is worthwhile but belongs in separate PRs alongside the bug fixes tracked in #475 — adding tests here would break the "pure code motion" property of this refactor. |
Adds the aggregate tracking issue for 10 pre-existing bugs surfaced during review of the calls.py decomposition (#474). Co-Authored-By: Claude <noreply@anthropic.invalid>
Documentation updates that should have been in PR #474: - Bump version to 0.0.113 - CHANGELOG.md: add v0.0.113 entry for the refactor + reference to #475 (the aggregate issue tracking 10 pre-existing bugs surfaced in review) - HISTORY.md Stage 11: add v0.0.113 row; update by-the-numbers header to v0.0.113; bump commit and release totals - KNOWN_ISSUES.md: remove the #418 'Refactoring needed' row (done) - README.md: update version badge and release count - docs/index.html: update version - Regenerate site assets - Regenerate uv.lock for v0.0.113 Co-Authored-By: Claude <noreply@anthropic.invalid>
Track the CI improvement that would have caught the missing CHANGELOG entry on #474. Sized at 30-60 minutes; natural Stage 11 tooling win. Co-Authored-By: Claude <noreply@anthropic.invalid>
…478) Closes #478. Adds scripts/check_changelog_updated.py, which diffs the current branch against origin/main and fails if any substantive file (vera/, spec/, SKILL.md) was changed without a matching new entry in CHANGELOG.md. ## Files - scripts/check_changelog_updated.py: the check itself - tests/test_check_changelog_updated.py: 54 unit + end-to-end tests (classification, diff parsing, trailer detection, temp-repo integration covering substantive/exempt/label/trailer paths) ## Integration - .pre-commit-config.yaml: new 'check-changelog-updated' hook at 'pre-push' stage (not per-commit, to avoid feature-branch noise) - .github/workflows/ci.yml: new step in the 'lint' job with 'fetch-depth: 0' so origin/main is available for the diff; the step is skipped when the PR has a 'skip-changelog' label ## Escape hatches - Commit trailer 'Skip-changelog: <reason>' (Git-native, local + CI) - PR label 'skip-changelog' (CI-only) - CHANGELOG_CHECK_BASE=<ref> env var to override the base ref ## Documentation - CONTRIBUTING.md: new 'Pre-push hook: CHANGELOG enforcement' subsection documenting the rule, install command, and escape hatches; install instructions updated to 'pre-commit install --hook-type pre-push' - TESTING.md: hook count 23 -> 24, test count 3,253 -> 3,307, new row for test_check_changelog_updated.py - ROADMAP.md: test count 3,307 + v0.0.113 totals line - CHANGELOG.md: new '[Unreleased] Added' entry documenting the rule Motivation: #474 merged without a CHANGELOG entry, HISTORY.md update, or KNOWN_ISSUES.md cleanup because none of those files were in the commit — the existing hooks only run on files they touch. This check reverses the logic: run always, check that the PR diff is self-consistent. Co-Authored-By: Claude <noreply@anthropic.invalid>
Clean-up-day PR covering ten small documentation inconsistencies surfaced by a cross-reference audit of open issues against all documentation. No code changes. ### Issue hygiene - Close #418 manually (PR #474 lacked the auto-close keyword in its body, so GitHub didn't close the issue at merge time; done via gh issue close, separate from this PR) - Link #424 in ROADMAP Phase 3b (prose bullet was present but missing the issue reference) - Add #439 to Continuous > Verification depth table - Add #480 to Phase 4c next to #466 (iterative WASM for higher-order array ops; prerequisite for scaling #466) - Add #481 to Continuous > CI tooling table (auto-tag + auto-release on version bump; sibling to #478) - Remove #416 and #417 from SKILL.md limitations (both closed; following project convention of delete-not-strikethrough) ### Stale counts - README.md: 3,253 → 3,318 tests - HISTORY.md by-the-numbers v0.0.113 column: 3,253 → 3,318 ### vera/README.md compiler internals - calls.py row: 8,332 → 572 (dispatcher); add 8 mixin rows for calls_arrays, calls_containers, calls_encoding, calls_handlers, calls_markup, calls_math, calls_parsing, calls_strings - wasm/ total: 12,672 → 12,998 - codegen/ line counts refreshed for all 10 sub-modules - 'Files:' summary line: wasm/ (4,273 across 7 modules) → (12,998 across 17 modules); codegen/ (3,137) → (6,618) ### Unreleased CHANGELOG entry Added under Documentation heading so the check_changelog_updated hook is satisfied. Co-Authored-By: Claude <noreply@anthropic.invalid>
Summary
Split the 8,390-line
vera/wasm/calls.pyinto a small core dispatcher plus 8 domain-focused mixins, matching the refactoring plan in issue #418. Pure code motion — no behavioral changes. Prepares the ground for Stage 11, which adds ~40 new built-in functions (#463, #366, #466, #467, #470, #471) that would otherwise pile onto the monolith.Every commit keeps the tree green so bisection works.
Final structure
calls.pycalls_math.pycalls_markup.pycalls_arrays.pycalls_handlers.pycalls_containers.pycalls_parsing.pycalls_encoding.pycalls_strings.pycalls.pyshrunk by 93% (8,390 → 572 lines).Architecture
The
vera/wasm/package already used a mixin composition pattern —WasmContextincontext.pyinherited from 5 mixins (InferenceMixin, OperatorsMixin, CallsMixin, ClosuresMixin, DataMixin). This PR just adds 8 more to the inheritance list. Mixins communicate viaself.foo()and Python's MRO; no explicit imports between mixins.All shared state lives in
WasmContext.__init__, so the split doesn't change initialization. The pyproject.toml mypy override list was extended to cover the new mixin modules (same pattern as the existing wasm mixins).Verification
vera check+vera verifyCommits
🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes