Summary
Vera's string built-ins cover the most common operations but are missing several utilities that come up frequently in text processing and formatting. Two distinct gaps:
- Character-level ops (string_reverse, string_pad_*, string_chars, string_trim_*): replace hand-written per-character recursion.
- Structural splits (string_lines, string_words): string_split only takes a single delimiter character, so splitting on line endings (handling \r\n) or whitespace runs requires regex today.
Proposed API
Transformation
string_reverse(@String) -> @String — reverse a string (Unicode-aware, by grapheme cluster)
string_pad_start(@String, @Nat, @String) -> @String — pad to target length with fill string (left-pad)
string_pad_end(@String, @Nat, @String) -> @String — pad to target length with fill string (right-pad)
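As a rough illustration of the intended padding semantics, here is a minimal Python sketch. It assumes the target length is counted in code points (via len()) and that a multi-character fill is repeated and truncated, as JavaScript's padStart()/padEnd() do; the proposal does not specify either point. Python's own str.rjust()/str.ljust() accept only a single fill character, so the sketch builds the padding by hand.

```python
def string_pad_start(s: str, target: int, fill: str) -> str:
    """Left-pad s with repetitions of fill until it is target characters long."""
    missing = target - len(s)
    if missing <= 0 or not fill:
        # Already long enough, or nothing to pad with: return unchanged.
        return s
    repeats = -(-missing // len(fill))  # ceiling division
    # Repeat the fill, then cut it to exactly the width still needed.
    return (fill * repeats)[:missing] + s


def string_pad_end(s: str, target: int, fill: str) -> str:
    """Right-pad s with repetitions of fill until it is target characters long."""
    missing = target - len(s)
    if missing <= 0 or not fill:
        return s
    repeats = -(-missing // len(fill))
    return s + (fill * repeats)[:missing]


print(string_pad_start("42", 5, "0"))   # 00042
print(string_pad_end("ab", 5, "xyz"))   # abxyz
```

Returning the input unchanged for an empty fill is a choice made for this sketch, not something the proposal states.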
Decomposition
string_chars(@String) -> @Array<String> — split into individual characters (grapheme clusters) as an array of single-character strings. Bridge primitive: once this exists, string operations become array operations (e.g. string_reverse = string_from_chars(array_reverse(string_chars(s))), character predicates become array_any(string_chars(s), is_digit)).
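The grapheme-cluster requirement matters for a host implementation, because Python's list(s) splits by code point rather than by grapheme cluster. The sketch below is only an approximation under a stated assumption: it attaches combining marks to the preceding base character using the standard unicodedata module, and does not implement full Unicode segmentation (emoji sequences, Hangul jamo, and so on). The name approx_chars is hypothetical.

```python
import unicodedata


def approx_chars(s: str) -> list[str]:
    """Split s into base characters with their combining marks attached.

    Approximation only: this is not full grapheme-cluster segmentation.
    """
    clusters: list[str] = []
    for ch in s:
        if clusters and unicodedata.combining(ch) != 0:
            # Combining mark: keep it with the character it modifies.
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def reverse_via_chars(s: str) -> str:
    """Reverse s by reversing its (approximate) clusters, not its code points."""
    return "".join(reversed(approx_chars(s)))


accented = "e\u0301x"               # "e" + combining acute accent, then "x"
print(len(list(accented)))          # 3 code points
print(len(approx_chars(accented)))  # 2 clusters
```

Reversing by code point (accented[::-1]) would move the accent onto the "x", which is the failure the cluster-based definition avoids.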
Structural splits
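string_lines(@String) -> @Array<String>: split on line terminators (\n, \r\n, \r). Comparable to Python's str.splitlines(), Rust's str::lines(), and Java's String.lines(), though these differ in detail: Rust's does not treat a bare \r as a terminator, and Python's also splits on additional separators such as \f and \u2028. The current workaround requires string_split("\n") and manual \r stripping, which LLMs tend to get wrong on Windows-style line endings.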
string_words(@String) -> @Array<String> — split on whitespace runs (spaces, tabs, newlines), discarding empty segments. Mirrors Python's str.split() with no argument. Currently impossible with string_split (single-delimiter only); requires regex.
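The two structural splits can be sketched in Python as follows. This is a hedged illustration, not the proposed implementation: it splits lines on exactly the three terminators listed above and, as an assumption borrowed from str.splitlines(), drops the empty segment produced by a trailing terminator.

```python
import re

# \r\n must come first so a Windows line ending is consumed as one terminator.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def string_lines(s: str) -> list[str]:
    """Split s on \\n, \\r\\n, or \\r."""
    parts = _LINE_BREAK.split(s)
    if parts and parts[-1] == "":
        # Assumption: a trailing terminator does not yield a final empty line.
        parts.pop()
    return parts


def string_words(s: str) -> list[str]:
    """Split s on runs of whitespace, discarding empty segments."""
    return s.split()


print(string_lines("a\r\nb\rc\n"))   # ['a', 'b', 'c']
print("a\r\nb".split("\n"))          # ['a\r', 'b']  (the naive workaround)
print(string_words("  a \t b\n"))    # ['a', 'b']
```

The second print shows the stray \r that a single-delimiter split leaves behind on Windows-style input.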
Trimming variants
string_trim_start(@String) -> @String — remove leading whitespace only
string_trim_end(@String) -> @String — remove trailing whitespace only
Note: string_strip already exists for trimming both sides.
Implementation
- environment.py: Register as pure functions
- codegen/api.py: Host imports delegating to Python string methods (.ljust(), .rjust(), .lstrip(), .rstrip(), list() for chars, .splitlines() for lines, .split() with no arg for words)
- Browser runtime: padStart(), padEnd(), trimStart(), trimEnd(), spread operator for chars, regex-based split for lines/words
- Verification:
  - string_reverse preserves length
  - string_pad_start / string_pad_end: result length >= input length
  - string_trim_*: result length <= input length
  - string_chars: result length = string_length of input
  - string_lines / string_words: concatenating results (with separator) recovers a prefix of the input
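The length properties above can be exercised with plain assertions. The sketch below is only a sanity check under loud assumptions: it uses Python built-ins as stand-ins for the proposed functions (str.rjust for string_pad_start, str.ljust for string_pad_end, list() for string_chars), measures length in code points, and restricts the fill to one character because rjust/ljust require that. It says nothing about how Vera's verifier would express these contracts, and it leaves out the string_lines / string_words property, which depends on the separator chosen.

```python
def check_string_properties(s: str, width: int, fill: str = " ") -> None:
    """Assert the length properties on one input, using Python stand-ins."""
    # string_reverse preserves length.
    assert len(s[::-1]) == len(s)
    # Padding never shortens the string.
    assert len(s.rjust(width, fill)) >= len(s)
    assert len(s.ljust(width, fill)) >= len(s)
    # Trimming never lengthens the string.
    assert len(s.lstrip()) <= len(s)
    assert len(s.rstrip()) <= len(s)
    # string_chars yields one element per character.
    assert len(list(s)) == len(s)


for sample in ["", "abc", "  padded  ", "line1\r\nline2", "tab\tsep"]:
    for width in (0, 3, 20):
        check_string_properties(sample, width)
```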
Priority
Medium. Less urgent than array utilities, sleep, and random. Within this issue, priority order:
1. string_chars: bridge primitive; unlocks the array-combinator approach to string processing
2. string_lines / string_words: real capability gaps; string_split can't currently do these correctly
3. string_pad_start / string_pad_end: common formatting need
4. string_reverse, string_trim_start, string_trim_end: convenience; once string_chars exists, several of these become one-liners