Implement eight string/conversion built-in operations #173
Add runtime support for the three string operations specified in Chapter 4, Section 4.13 of the language spec. These are the first dynamic string operations in the compiler; prior to this, only string constants were supported.

Implementation:
- Register string_length, string_concat, string_slice as built-in functions in the type checker environment
- Add WASM codegen for all three operations using the existing bump allocator ($alloc) and byte-copy loops for string_concat/string_slice
- Add return type inference for string operations in both WASM and Vera type inference paths
- Add string built-ins to the known-functions whitelist in the cross-module call scanner

Tests: 18 new tests (5 checker + 13 codegen), all passing.

Example: examples/string_ops.vera demonstrates all three operations.

Partial progress on aallan#52 and aallan#134.

Co-Authored-By: Claude <noreply@anthropic.invalid>
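The allocation and copy strategy described in this commit can be modelled outside WASM. The following is a minimal Python sketch, not the PR's codegen. It assumes a string is a (pointer, length) pair into linear memory and that string_slice takes a half-open [start, end) byte range with clamping; neither detail is stated in the commit message. `LinearMemory`, `store_str`, `load_str`, and `copy_bytes` are hypothetical names used only for illustration.

```python
from typing import Tuple

Str = Tuple[int, int]  # assumed representation: (pointer, byte length)


class LinearMemory:
    """Toy model of WASM linear memory with a bump allocator."""

    def __init__(self, size: int = 1024) -> None:
        self.data = bytearray(size)
        self.next_free = 0  # bump pointer: only ever moves forward

    def alloc(self, n: int) -> int:
        """Reserve n bytes and return their start address; nothing is freed."""
        if self.next_free + n > len(self.data):
            raise MemoryError("out of linear memory")
        ptr = self.next_free
        self.next_free += n
        return ptr

    def store_str(self, text: str) -> Str:
        """Demo helper: place a string constant in memory."""
        raw = text.encode("utf-8")
        ptr = self.alloc(len(raw))
        self.data[ptr:ptr + len(raw)] = raw
        return ptr, len(raw)

    def load_str(self, s: Str) -> str:
        """Demo helper: read a (pointer, length) string back out."""
        ptr, length = s
        return self.data[ptr:ptr + length].decode("utf-8")


def copy_bytes(mem: LinearMemory, dst: int, src: int, n: int) -> None:
    """Byte-copy loop, one byte per iteration."""
    for i in range(n):
        mem.data[dst + i] = mem.data[src + i]


def string_length(s: Str) -> int:
    return s[1]


def string_concat(mem: LinearMemory, a: Str, b: Str) -> Str:
    """Allocate len(a) + len(b) bytes, then copy a followed by b."""
    out = mem.alloc(a[1] + b[1])
    copy_bytes(mem, out, a[0], a[1])
    copy_bytes(mem, out + a[1], b[0], b[1])
    return out, a[1] + b[1]


def string_slice(mem: LinearMemory, s: Str, start: int, end: int) -> Str:
    """Copy bytes [start, end) into a fresh allocation (clamping is assumed)."""
    start = min(start, s[1])
    end = max(start, min(end, s[1]))
    n = end - start
    out = mem.alloc(n)
    copy_bytes(mem, out, s[0] + start, n)
    return out, n
```

Because the allocator never frees, every concat or slice permanently consumes memory in this model, which is consistent with the PR's note that there is no GC dependency.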
…uilt-ins

Add five more string/conversion built-in functions to complete the set needed for string-processing Vera programs:
- char_code(String, Int -> Nat): returns byte value at given index
- parse_nat(String -> Nat): parses decimal string to natural number, skipping leading spaces
- parse_float64(String -> Float64): parses decimal string with optional sign and decimal point to 64-bit float
- to_string(Int -> String): converts integer to decimal string representation (handles negatives)
- strip(String -> String): trims leading/trailing ASCII whitespace (space, tab, CR, LF); returns a view without allocation

Implementation details:
- parse_float64 handles sign, integer part, and fractional part using f64 arithmetic in WASM
- to_string uses a 20-byte temp buffer with reverse digit extraction
- strip returns a pointer into the original string (zero-copy)

Tests: 34 new tests (5 checker + 29 codegen), all passing.

Updated string_ops.vera example to demonstrate all eight string built-ins.

Further progress on aallan#52 and aallan#134.

Co-Authored-By: Claude <noreply@anthropic.invalid>
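Two of the implementation details above, the reverse-digit buffer in to_string and the zero-copy view in strip, are easy to illustrate in isolation. This is a hedged Python sketch rather than the actual WASM output. It assumes Int fits a signed 64-bit range (which is what would make 20 bytes sufficient: a sign plus 19 digits) and that a string view is a (pointer, length) pair; both are assumptions, not statements from the commit.

```python
from typing import Tuple

I64_MIN = -(2 ** 63)
I64_MAX = 2 ** 63 - 1


def to_string(n: int) -> bytes:
    """Fill a 20-byte buffer from the right, one decimal digit at a time."""
    if not I64_MIN <= n <= I64_MAX:
        raise ValueError("sketch assumes a signed 64-bit integer")
    buf = bytearray(20)          # sign + up to 19 digits
    pos = len(buf)               # write position, moving right to left
    negative = n < 0
    magnitude = -n if negative else n
    while True:
        pos -= 1
        buf[pos] = ord("0") + magnitude % 10   # least significant digit first
        magnitude //= 10
        if magnitude == 0:
            break
    if negative:
        pos -= 1
        buf[pos] = ord("-")
    return bytes(buf[pos:])      # the used tail of the buffer is the result


WHITESPACE = b" \t\r\n"          # space, tab, CR, LF


def strip(data: bytes, ptr: int, length: int) -> Tuple[int, int]:
    """Return a narrowed (pointer, length) view; no bytes are copied."""
    start, end = ptr, ptr + length
    while start < end and data[start] in WHITESPACE:
        start += 1
    while end > start and data[end - 1] in WHITESPACE:
        end -= 1
    return start, end - start
```

One consequence of the view approach in this model: the stripped string shares storage with the original, so it is only valid for as long as the original bytes are.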
Hi Alasdair! Hot-wired indeed!
aallan
left a comment
Thanks so much for this! The implementation follows all the project's patterns perfectly, the test coverage is thorough, and the WASM codegen is clean and well-commented. The strip zero-copy approach is particularly clever.
I'm going to push a small commit to align two type signatures with the spec:
string_length will return Nat (was Int): spec §4.13 specifies a non-negative return value
string_slice will take Nat indices (was Int): spec §4.13 specifies non-negative positions
One other spec discrepancy: parse_nat should return Result<Nat, String> per spec §9 rather than bare Nat. That's a bigger change — it requires new codegen infrastructure for built-in functions returning ADTs, plus digit validation and error paths — so I'll open an issue to track it separately rather than hold up this PR.
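To make the scope of that follow-up concrete, here is a rough Python sketch of what digit validation and error paths could look like. It is not the spec's definition or the planned implementation: the Result is modelled as a tagged tuple, and the error messages are hypothetical. Skipping leading spaces follows the behaviour described in the PR.

```python
from typing import Tuple, Union

# Assumed stand-in for Result<Nat, String>: ("Ok", value) or ("Err", message).
Result = Union[Tuple[str, int], Tuple[str, str]]


def parse_nat(text: str) -> Result:
    """Parse a decimal natural number, rejecting anything that is not a digit."""
    i = 0
    # Skip leading spaces, as the current built-in does.
    while i < len(text) and text[i] == " ":
        i += 1
    if i == len(text):
        return ("Err", "no digits found")  # hypothetical message
    value = 0
    while i < len(text):
        ch = text[i]
        if not "0" <= ch <= "9":
            # Hypothetical message; a sign, letter, or inner space lands here.
            return ("Err", f"invalid character {ch!r} at index {i}")
        value = value * 10 + (ord(ch) - ord("0"))
        i += 1
    return ("Ok", value)
```

A caller would branch on the tag, for example treating `parse_nat("12a")` as a recoverable error instead of receiving a meaningless bare Nat.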
I'll handle the version bump, CHANGELOG, docs updates, and spec updates in a follow-up PR after merging this. Really appreciate the contribution!
- string_length: return type INT → NAT (spec §4.13 says non-negative)
- string_slice: param types INT, INT → NAT, NAT (spec §4.13 says non-negative positions)
- Update test_string_slice_ok to use @nat params

Co-Authored-By: Claude <noreply@anthropic.invalid>
Version bump and documentation updates following PR #173 (eight string/conversion built-in operations by @rlseaman).

- Version 0.0.49 → 0.0.50
- CHANGELOG: document all 8 operations, note parse_nat limitation (#174)
- spec/04-expressions.md §4.13: list all 8 operations with signatures, remove "Not yet implemented" banner, add parse_nat Result note
- SKILLS.md: new "Built-in Functions" section with string operations
- vera/README.md: add 8 string functions to built-ins table
- README.md: update test/example counts, strike #134, add #174 to roadmap
- TESTING.md: update test counts (1,267 tests, 15 examples)
- CLAUDE.md: update example count (14 → 15)
- CONTRIBUTING.md: add spec alignment guidance for built-in functions
- Fix README allowlist line numbers (scripts + test)

Co-Authored-By: Claude <noreply@anthropic.invalid>
All 8 string built-in operations (string_concat, to_string, string_slice, strip, parse_nat, parse_float64, char_code, string_length) were implemented in WASM codegen across v0.0.50 (PR #173) and v0.0.60 (#174). This updates limitation tables in spec/11, spec/12, README, and vera/README to reflect that #52 is complete. Also fixes stale #53 and #110 rows in spec/12-runtime to match spec/11-compilation. GC for string memory remains in #51.

Co-Authored-By: Claude <noreply@anthropic.invalid>
Summary
Implements all eight string/conversion built-in operations needed for string-processing Vera programs. These are the first dynamic string operations in the compiler — prior to this, only string constants (literals) were supported.
Operations added
| Operation | Signature |
| --- | --- |
| string_length | String -> Int |
| string_concat | String, String -> String |
| string_slice | String, Int, Int -> String |
| char_code | String, Int -> Nat |
| parse_nat | String -> Nat |
| parse_float64 | String -> Float64 |
| to_string | Int -> String |
| strip | String -> String |

Uses the existing bump allocator ($alloc) for heap allocation. No GC dependency.

Notable design choices
- strip returns a zero-copy view into the original string (no allocation)
- parse_float64 handles optional sign, integer part, and fractional part
- to_string handles negative numbers; uses a 20-byte reverse-digit buffer
- parse_nat skips leading spaces for compatibility with fixed-width format parsing

Files changed
- vera/environment.py
- vera/wasm/calls.py
- vera/wasm/inference.py
- vera/codegen/modules.py
- tests/test_checker.py
- tests/test_codegen.py
- examples/string_ops.vera

Related Issues
Partial progress on #52 and #134.
Type of Change
Checklist
Local validation
- vera check and vera verify vera/
- parse_nat(to_string(123)) round-trips correctly
- strip(string_slice(...)) chains correctly for fixed-width field extraction

🤖 Generated with Claude Code