Commit e54841f

feat(parser): PAR020 actionable error for missing ';' between block statements
The #1 unactionable small-model failure in agent mode is the mirror of PAR017: a *missing* ';' between block statements. A model writes a { } block body and drops the separator (`pure func f() -> int { let n = length(s) if n > 0 ... }`), and the parser emitted a bare "PAR_UNEXPECTED_TOKEN: expected }, got if" with zero recovery signal. config_file_parser burned 66 agent turns on exactly this.

The parser now emits PAR020, "missing ';' between block statements (found `X` where `;` or `}` was expected)", with the concrete two-line fix and a docs link, when a block body (function-declaration path, parser_func.go) or block expression (parser_expr.go) is followed by a statement-starting token (let/letrec/if/match/identifier) instead of ';' or '}'. The logic is shared via missingBlockSemicolonError() + peekStartsBlockStatement().

PAR017 (extra ';') + PAR020 (missing ';') now bookend the whole ';'-confusion family, which accounts for ~32% of local-qwen agent failures. Found via the M-AILANG-ERROR-QUALITY frequency analysis of 334 qwen trials.

- TestPAR020_MissingBlockSemicolon: fires on the pattern; no false-positive on valid or single-expression blocks.
- parser/elaborate/pipeline suites green; make verify-examples at baseline (181/5/2).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1 parent c9da064 commit e54841f
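
The detection rule described in the commit message can be illustrated with a small standalone sketch. This is not the AILANG parser: the `TokenType`, `Token`, and `diagnose` names below are hypothetical stand-ins, and only the set of statement-starting tokens and the message wording are taken from the commit. It shows how the token found after a completed statement selects between no error, PAR020, and the generic unexpected-token error.

```go
package main

import "fmt"

// TokenType is a hypothetical stand-in for the lexer's token kinds.
type TokenType string

const (
	LET    TokenType = "let"
	LETREC TokenType = "letrec"
	IF     TokenType = "if"
	MATCH  TokenType = "match"
	IDENT  TokenType = "IDENT"
	SEMI   TokenType = ";"
	RBRACE TokenType = "}"
	INT    TokenType = "INT"
)

// Token is a simplified token: a kind plus its source text.
type Token struct {
	Type    TokenType
	Literal string
}

// startsBlockStatement mirrors the idea of peekStartsBlockStatement:
// only high-signal statement starters count, so literals and operators
// still fall through to the generic error.
func startsBlockStatement(t Token) bool {
	switch t.Type {
	case LET, LETREC, IF, MATCH, IDENT:
		return true
	default:
		return false
	}
}

// diagnose returns the message to report for the token that follows a
// completed statement inside a block, or "" if the block is well formed.
func diagnose(next Token) string {
	switch {
	case next.Type == SEMI || next.Type == RBRACE:
		return ""
	case startsBlockStatement(next):
		return fmt.Sprintf("PAR020: missing ';' between block statements (found `%s` where `;` or `}` was expected)", next.Literal)
	default:
		return fmt.Sprintf("PAR_UNEXPECTED_TOKEN: expected }, got %s", next.Literal)
	}
}

func main() {
	fmt.Printf("%q\n", diagnose(Token{SEMI, ";"}))
	fmt.Println(diagnose(Token{IF, "if"}))
	fmt.Println(diagnose(Token{INT, "42"}))
}
```

Keeping the starter set small is the design choice the commit calls out: a stray literal or operator is more likely a mid-expression mistake than a dropped separator, so it keeps the generic error in this sketch as well.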

5 files changed

Lines changed: 169 additions & 3 deletions

changelogs/v0.10-current.md

Lines changed: 11 additions & 0 deletions
@@ -4,6 +4,17 @@
 
 ## [Unreleased]
 
+### Added — PAR020: actionable "missing `;` between block statements" error (M-AILANG-ERROR-QUALITY)
+
+The #1 *unactionable* small-model failure in agent mode is the **mirror of PAR017**: a *missing* `;` between block statements. A model writes a `{ }` block body and drops the separator —
+```
+pure func f() -> int {
+  let n = length(s) // ← no ';'
+  if n > 0 then 1 else 0
+}
+```
+— and the parser emitted a bare `PAR_UNEXPECTED_TOKEN: expected }, got if` with **zero recovery signal**; `config_file_parser` burned **66 agent turns** on exactly this. The parser now emits **`PAR020`** — "missing ';' between block statements (found \`X\` where \`;\` or \`}\` was expected)" with the concrete two-line fix and a docs link — when a block body (function-declaration path, `parser_func.go`) or block expression (`parser_expr.go`) is followed by a statement-starting token (`let`/`letrec`/`if`/`match`/identifier) instead of `;`/`}`. PAR017 (extra `;` in `=`-body) + PAR020 (missing `;` in block) now bookend the whole `;`-confusion family — **~32% of local-qwen agent failures**. Guarded by `TestPAR020_MissingBlockSemicolon` (fires on the pattern; no false-positive on valid or single-expression blocks); parser/elaborate/pipeline suites green; `make verify-examples` at baseline. See [design_docs/planned/v0_24_0/m-ailang-error-quality-for-llm-iteration.md](../design_docs/planned/v0_24_0/m-ailang-error-quality-for-llm-iteration.md).
+
 ### Changed — Dialect-traps card: sharpen the #1 small-model failure (M-AILANG-ERROR-QUALITY)
 
 A frequency analysis of **334 local-qwen agent trials (44 failures)** found that **~36% of failures are a single family** — expression-body (`= expr`) vs block-body (`{ stmts }`) / statement-separator confusion — **dominated (20.5%) by the `func f() = let x = e; rest` reflex** (`PAR017`: `;` not valid in expression-body functions). The dialect-traps card (`prompts/agent/dialect-traps.md`, M-EVAL-PROMPT-DELIVERY) trap #2 was sharpened to name that exact anti-pattern and give both fixes — brace block (`func f() { let x = e; rest }`) or let-in (`func f() = let x = e in rest`); verified the anti-pattern rejects and both fixes run. Notably `match … with` (`PAR019`) and `++`-for-string-concat — the old *big-model* top failures — are now rare/zero on qwen, confirming the card already works for those; the small-model frequency banners on the sibling `m-prompt-*` docs undercount what's still live for the models we run continuously. Data recorded in [design_docs/planned/v0_24_0/m-ailang-error-quality-for-llm-iteration.md](../design_docs/planned/v0_24_0/m-ailang-error-quality-for-llm-iteration.md), which re-prioritizes it: the parser/card already cover `PAR017`, yet the model violates it and can't recover (`config_file_parser` thrashed **66 turns**) — the remaining lever is making `PAR017` recovery-actionable (suggest the two rewrites inline).

design_docs/planned/v0_24_0/m-ailang-error-quality-for-llm-iteration.md

Lines changed: 25 additions & 3 deletions
@@ -54,9 +54,31 @@ Two observations sharpen the priority:
 a never-passing benchmark has no baseline, so it thrashes unbounded (the 66-turn case).
 
 Interim mitigation already shipped: the dialect-traps card's trap #2 was sharpened to name the
-exact `func f() = let x = e; rest` reflex and give both fixes (brace block / `let … in`). The
-remaining lever — making `PAR017` itself recovery-actionable (suggest the two concrete rewrites
-inline) — is in scope for this sprint.
+exact `func f() = let x = e; rest` reflex and give both fixes (brace block / `let … in`).
+
+### Shipped 2026-06-10 — PAR020 (the missing-`;` mirror)
+
+Investigating the corpus showed the real *unactionable* thrash-causer is **not** PAR017 (which a
+prior iteration already made fully actionable — both fixes + docs link inline) but its **mirror**:
+a *missing* `;` between block statements. The model writes a `{ }` block body and drops the `;`:
+
+```
+pure func f() -> int {
+  let n = length(s) // ← no ';'
+  if n > 0 then 1 else 0
+}
+```
+
+The parser emitted a bare `PAR_UNEXPECTED_TOKEN: expected }, got if` — zero recovery signal;
+`config_file_parser` burned **66 agent turns** on exactly this. Fixed: when a block body / block
+expression is followed by a statement-starting token (`let`/`letrec`/`if`/`match`/identifier)
+instead of `;` or `}`, the parser now emits **`PAR020`** — "missing ';' between block statements"
+with the concrete two-line fix and a docs link — on both the function-declaration body path
+(`parser_func.go`) and the block-expression path (`parser_expr.go`). Guarded by
+`TestPAR020_MissingBlockSemicolon` (fires on the pattern; no false-positive on valid or
+single-expression blocks). PAR017 (extra `;`) + PAR020 (missing `;`) now bookend the whole `;`
+confusion family — **~32% of qwen failures**. Next-attempt recovery rate is measurable on the
+nightly's same-rotation re-run.
 
 ## Methodology
 

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
+package parser
+
+import (
+	"strings"
+	"testing"
+
+	"github.com/sunholo-data/ailang/internal/lexer"
+)
+
+// TestPAR020_MissingBlockSemicolon guards M-AILANG-ERROR-QUALITY: a missing ';'
+// between block statements (the mirror of PAR017's "extra ';' in =-body") must
+// produce an actionable PAR020 error naming the fix, not a bare
+// "expected }, got X". This is the #1 unactionable thrash-causer on small models
+// — config_file_parser burned 66 agent turns on a generic "expected }, got if".
+func TestPAR020_MissingBlockSemicolon(t *testing.T) {
+	t.Run("func body missing semicolon (the 66-turn pattern)", func(t *testing.T) {
+		input := "module t\n" +
+			"pure func f(n: int) -> int {\n" +
+			" let x = n\n" + // <-- missing ';'
+			" if x > 0 then 1 else 0\n" +
+			"}\n"
+		err := firstParserErrorWithCode(t, input, "PAR020")
+		if err == nil {
+			t.Fatal("expected PAR020 for a missing ';' between block statements")
+		}
+		msg := err.Error()
+		if !strings.Contains(msg, "missing ';'") {
+			t.Errorf("message should name the missing ';': %s", msg)
+		}
+		if !strings.Contains(msg, "separated by `;`") {
+			t.Errorf("message should explain block-statement separation: %s", msg)
+		}
+		if len(err.Suggestions) == 0 {
+			t.Error("PAR020 should carry concrete fix suggestions")
+		}
+	})
+
+	t.Run("missing semicolon before another let", func(t *testing.T) {
+		input := "module t\n" +
+			"pure func f(n: int) -> int {\n" +
+			" let x = n\n" + // <-- missing ';'
+			" let y = x\n" +
+			" y\n" +
+			"}\n"
+		if firstParserErrorWithCode(t, input, "PAR020") == nil {
+			t.Error("expected PAR020 when a let-statement is followed by another without ';'")
+		}
+	})
+
+	t.Run("valid block does NOT trigger PAR020", func(t *testing.T) {
+		input := "module t\n" +
+			"pure func f(n: int) -> int { let x = n; let y = x + 1; y }\n"
+		if err := firstParserErrorWithCode(t, input, "PAR020"); err != nil {
+			t.Errorf("valid semicolon-separated block must not trigger PAR020: %s", err.Error())
+		}
+	})
+
+	t.Run("single-expression body does NOT trigger PAR020", func(t *testing.T) {
+		input := "module t\n" +
+			"pure func f(n: int) -> int { n + 1 }\n"
+		if err := firstParserErrorWithCode(t, input, "PAR020"); err != nil {
+			t.Errorf("single-expression body must not trigger PAR020: %s", err.Error())
+		}
+	})
+}
+
+// firstParserErrorWithCode parses input and returns the first *ParserError whose
+// Code matches, or nil if none.
+func firstParserErrorWithCode(t *testing.T, input, code string) *ParserError {
+	t.Helper()
+	l := lexer.New(input, "<test>")
+	p := New(l)
+	_ = p.Parse()
+	for _, e := range p.errors {
+		if pe, ok := e.(*ParserError); ok && pe.Code == code {
+			return pe
+		}
+	}
+	return nil
+}

internal/parser/parser_expr.go

Lines changed: 45 additions & 0 deletions
@@ -406,6 +406,17 @@ func (p *Parser) parseBlockOrExpression() ast.Expr {
 			exprs = append(exprs, p.parseExpression(LOWEST))
 		}
 
+		// M-AILANG-ERROR-QUALITY (PAR020): if the next token instead begins a new
+		// statement (let / letrec / if / match / identifier), the block is missing a ';'
+		// between statements — the mirror of PAR017's "extra ';' in =-body". This is
+		// the #1 unactionable thrash-causer on small models: config_file_parser burned
+		// 66 agent turns on a bare "expected }, got if". Give the concrete fix instead.
+		if !p.peekTokenIs(lexer.RBRACE) && p.peekStartsBlockStatement() {
+			p.errors = append(p.errors, p.missingBlockSemicolonError())
+			p.traceDelimiterStack()
+			return nil
+		}
+
 		// Expect closing brace
 		if !p.expectPeek(lexer.RBRACE) {
 			p.errors = append(p.errors, fmt.Errorf("expected '}' to close function body at %s", p.peekToken.Position()))
@@ -428,6 +439,40 @@ func (p *Parser) parseBlockOrExpression() ast.Expr {
 	}
 }
 
+// peekStartsBlockStatement reports whether the peek token unambiguously begins a
+// new block statement (let / letrec / if / match / identifier). Used to detect a
+// missing ';' separator between block statements (PAR020). Kept to high-signal
+// statement-starters so the hint is precise — literals/operators are excluded
+// because they're more likely a mid-expression parse error than a dropped ';'.
+func (p *Parser) peekStartsBlockStatement() bool {
+	switch p.peekToken.Type {
+	case lexer.LET, lexer.LETREC, lexer.IF, lexer.MATCH, lexer.IDENT:
+		return true
+	default:
+		return false
+	}
+}
+
+// missingBlockSemicolonError builds the PAR020 actionable error for a missing
+// ';' between block statements (the mirror of PAR017's "extra ';' in =-body").
+// Used by both the block-expression parser and the function-declaration
+// block-body parser. Points at the offending (peek) token.
+func (p *Parser) missingBlockSemicolonError() *ParserError {
+	return NewSuggestionError(
+		"PAR020",
+		ast.Pos{Line: p.peekToken.Line, Column: p.peekToken.Column, File: p.peekToken.File},
+		p.peekToken,
+		fmt.Sprintf("missing ';' between block statements (found `%s` where `;` or `}` was expected)", p.peekToken.Literal),
+		[]string{
+			"Statements inside a `{ }` block are separated by `;`:",
+			" { let x = e1; let y = e2; result }",
+			"Add a `;` after the previous statement, before this one.",
+			"The block's LAST expression is the return value — no `;` after it.",
+		},
+		"https://ailang.sunholo.com/docs/reference/language-syntax",
+	)
+}
+
 // parseRecordLiteralContent / parseRecordUpdateContent moved to parser_record.go
 // (M-RELEASE-GATE follow-up: keep parser_expr.go under the 800-line limit).
 

internal/parser/parser_func.go

Lines changed: 8 additions & 0 deletions
@@ -199,6 +199,14 @@ func (p *Parser) parseFunctionDeclaration(isPure bool, isExport bool) *ast.FuncD
 		}
 		// Parse body as a block (semicolon-separated expressions)
 		fn.Body = p.parseFunctionBody()
+		// M-AILANG-ERROR-QUALITY (PAR020): a statement-starting token here means
+		// the block is missing a ';' separator (mirror of PAR017's extra ';').
+		// This is the #1 unactionable thrash-causer on small models —
+		// config_file_parser burned 66 agent turns on a bare "expected }, got if".
+		if !p.peekTokenIs(lexer.RBRACE) && p.peekStartsBlockStatement() {
+			p.errors = append(p.errors, p.missingBlockSemicolonError())
+			return nil
+		}
 		if !p.expectPeek(lexer.RBRACE) {
 			return nil
 		}
