Pre-existing bug surfaced by CodeRabbit on PR #567 (v0.0.129). Not in scope for #475 PR 2 (which addressed seven specific Major findings); filed separately for follow-up.
Reproduction
public fn main(@Unit -> @Unit)
requires(true) ensures(true) effects(<IO>)
{
match url_parse(":foo") {
Ok(@UrlParts) -> IO.print(url_join(@UrlParts.0)),
Err(@String) -> IO.print("ERR")
}
}
Expected: either round-trips as :foo or url_parse rejects the input as not RFC 3986-compliant (a URL scheme must start with ALPHA per § 3.1).
Actual: url_parse returns Ok with scheme="" (empty), path="foo"; url_join then drops the colon (because its scheme delimiter emit is gated on s_len > 0 in both explicit-mode and legacy-mode branches) and produces foo.
Root cause
In vera/wasm/calls_encoding.py _translate_url_parse:
- The colon scan loop sets
colon_pos = 0 for input :foo (colon found at position 0)
s_len = colon_pos = 0 (empty scheme)
- The flag word at offset 44 captures has_auth / has_query / has_frag but not whether a colon was present
In _translate_url_join, the scheme delimiter emit branch checks s_len > 0. With s_len = 0 no : is written.
Fix options
- RFC-compliant rejection:
url_parse rejects colon_pos == 0 (returns Err). Behavior change — programs that previously got Ok for :foo would get Err. Simpler.
- Add
has_colon flag: extend the offset-44 packed-flag word with a fourth bit (the explicit-mode sentinel currently lives at bit 3; has_colon would go at bit 4) so url_join can emit : whenever url_parse saw one, independent of s_len. Preserves current Ok/Err behavior of url_parse; only fixes round-trip.
Option 2 is more conservative; option 1 is more correct vs RFC 3986.
Verified pre-existing
Reproduced on main at 448924c (vera/wasm/calls_encoding.py decomposition commit) — same foo output.
References
Pre-existing bug surfaced by CodeRabbit on PR #567 (v0.0.129). Not in scope for #475 PR 2 (which addressed seven specific Major findings); filed separately for follow-up.
Reproduction
Expected: either round-trips as
:fooor url_parse rejects the input as not RFC 3986-compliant (a URL scheme must start with ALPHA per § 3.1).Actual: url_parse returns Ok with scheme="" (empty), path="foo"; url_join then drops the colon (because its scheme delimiter emit is gated on
s_len > 0in both explicit-mode and legacy-mode branches) and producesfoo.Root cause
In
vera/wasm/calls_encoding.py_translate_url_parse:colon_pos = 0for input:foo(colon found at position 0)s_len = colon_pos = 0(empty scheme)In
_translate_url_join, the scheme delimiter emit branch checkss_len > 0. Withs_len = 0no:is written.Fix options
url_parserejectscolon_pos == 0(returns Err). Behavior change — programs that previously got Ok for:foowould get Err. Simpler.has_colonflag: extend the offset-44 packed-flag word with a fourth bit (the explicit-mode sentinel currently lives at bit 3;has_colonwould go at bit 4) so url_join can emit:whenever url_parse saw one, independent ofs_len. Preserves current Ok/Err behavior of url_parse; only fixes round-trip.Option 2 is more conservative; option 1 is more correct vs RFC 3986.
Verified pre-existing
Reproduced on
mainat448924c(vera/wasm/calls_encoding.pydecomposition commit) — samefoooutput.References
scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )