SKILL.md documentation sweep + bug tracking from Game of Life agent run (#513) #518
Two documentation gaps surfaced by the same agent writing Conway's Game of Life, the same agent whose earlier feedback drove the Stage 11 primitives push. Both features WORK; they just weren't in SKILL.md, so an agent reading only SKILL.md found workarounds where idiomatic solutions exist.

Agent friction #1 (array literals): "I don't see direct array literal syntax. I could use array_range(0, 0) to get an empty array, but that returns Array<Int> and I need Array<Bool>. Looking more carefully at the skill documentation, I see [] is just an index operator, not an array literal."

Agent friction #2 (closure capture): "Closures seem to only reference their own parameters through De Bruijn indices, with no clear way to access outer bindings. Since the documentation doesn't explicitly cover closure capture semantics, I'll stick with recursion as the safer approach."

Both correctly followed the (incomplete) documentation to the wrong conclusion.

Additions:
- New "Array literals" subsection under "Composite types": shows [], [1, 2, 3], type inference, nested arrays, and the context-disambiguation between literal-[] and postfix-index-[]. Verified with a live smoke test: Array<Bool> = [] followed by = [true, false, true] compiles and returns length 3.
- New "Closures and captured bindings" section between Iteration and Built-in Functions, with: (a) syntax example, (b) worked fold example capturing an outer `let @int = 100` at Int.2 (smoke-tested returning 306), (c) explicit rule for counting De Bruijn shifts by the closure's own parameters of each type, (d) the --explain-slots tip, (e) a "when to use recursion instead" section listing the four cases closures don't cover (non-pure effect rows, early-return, non-linear iteration, termination proofs requiring `decreases`).

Allowlist bookkeeping:
- Added 3 new FRAGMENT entries (my two new bare-let blocks plus md_parse signatures that shifted).
- Updated 2 shifted entries (handle syntax template 1388→1459, import aliasing 1878→1949).

No code changes. No spec changes. No test changes beyond allowlist-shift in scripts/check_skill_examples.py.

Branch kept open to collect further agent-friction gaps as they arrive. This is a small PR that can grow by one commit per signal until the agent stops hitting walls.

Co-Authored-By: Claude <noreply@anthropic.invalid>
, #514)

Second wave of agent-friction documentation, plus a filed compiler bug (#514) the agent surfaced during the same Game of Life run.

SKILL.md additions:
- **String escape sequences table**: full set (\\n, \\t, \\r, \\0, \\\\, \\", \\u{XXXX}). Call out \\x / \\v / \\f etc. NOT supported, with two documented workarounds (unicode escape or string_from_char_code). Confirms raw UTF-8 literals work ("██" compiles and prints). Triggered by the agent trying \\x1b for an ANSI sequence and needing to discover string_from_char_code by inspection.
- **Nullary vs Unit-taking functions**: both `(-> T)` and `(@Unit -> T)` are valid; call sites must match the arity exactly. Agent noticed both in the codebase and wondered which was "right". Worked examples show both compile and run.
- **Stored function values and apply_fn**: the agent discovered apply_fn by inspecting prelude.py; it was never in SKILL.md. Added a subsection with a worked example and a pointer to examples/closures.vera and ch05_closures.vera for the full shape.
- **Known Bugs and Workarounds section** near the end of SKILL.md: new, dedicated table with the three classes of runtime bugs agents are most likely to hit: nested closures (#514), the 10 pre-existing translator bugs (#475), and the 64 KB single-alloc limit (#487). Each entry gives a shape, a concrete workaround, and the tracking issue. Terminal sentence pins the diagnostic heuristic: "type-checks cleanly + mysterious runtime trap = probably #514 in a new shape."
- **Closure known-limitation subsection**: expanded to cover the two distinct failure modes (compile-time nested; runtime captured-scalar-through-indirection) with the agent's refined isolation matrix (see #514 comment).

Non-SKILL updates:
- KNOWN_ISSUES.md: new #514 row under Bugs covering both failure modes and the workaround.
- scripts/check_skill_examples.py: clean allowlist rewrite, 66 unique entries, AST-verified zero duplicate keys. The previous state had 7 duplicate keys from fix_allowlists shifting entries into collision; memory feedback_spec_allowlist.md covers this failure mode.

Full validation green: 54 parsed + 66 allowlisted = 120 SKILL.md blocks handled, 0 failures; doc counts consistent; site assets up-to-date; 19 doc-related tests pass.

Co-Authored-By: Claude <noreply@anthropic.invalid>
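The "AST-verified zero duplicate keys" check mentioned in the commit above matters because a Python dict literal with a repeated key silently keeps only the last value, so the collision is invisible at runtime. A minimal sketch of such a check is below; the function name, the sample allowlist, and its line-number keys are hypothetical and are not taken from scripts/check_skill_examples.py.

```python
import ast


def duplicate_dict_keys(source: str) -> list:
    """Return constant keys that occur more than once in any dict literal."""
    duplicates = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Dict):
            continue
        seen = set()
        for key in node.keys:
            # `key` is None for `**spread` entries; only constant keys are compared.
            if not isinstance(key, ast.Constant):
                continue
            if key.value in seen:
                duplicates.append(key.value)
            seen.add(key.value)
    return duplicates


# Hypothetical allowlist keyed by line number; 1459 is listed twice.
SAMPLE = """
ALLOWLIST = {
    1459: "FRAGMENT",
    1949: "FRAGMENT",
    1459: "FRAGMENT",
}
"""

print(duplicate_dict_keys(SAMPLE))  # prints [1459]
```

Parsing the source rather than importing the module is the point of the design: once the dict has been evaluated, the duplicate entry is already gone.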
Two new issues surfaced by the same agent run:
- #515: $gc_collect itself faults with out-of-bounds memory access under sustained allocation pressure. 40×20 Conway's Game of Life over 200 generations reliably reproduces. The collector walks past $heap_ptr to the linear-memory bound and traps; gc_collect is the top frame of the crashing stack, not the program. This is distinct from #487 (alloc path) and #484 (sweeper 16-bit truncation).
- #516: Runtime traps bubble up as raw wasmtime stack traces; the CLI mis-labels every Trap/WasmtimeError as "Runtime contract violation" even when the actual cause is out-of-bounds memory access, integer overflow, unreachable, etc. No source-line mapping, no actionable guidance: the exact opposite of the carefully crafted compile-time diagnostics. Three-stage proposed scope: categorise trap reason, source-map the Vera function that trapped, specialise help for common trap classes.

KNOWN_ISSUES.md updated with rows for both.

ROADMAP.md implementation-order table re-sorted:
1. #514 (nested closures + captured-scalar indirection)
2. #515 (GC collect faults)
3. #516 (runtime trap diagnostics)
4. #475 (WASM translator bug cleanup, previously #1)
5. #507 (Eq/Ord-dispatched array ops)

The top three are all surfaced by the single Game of Life agent run; the empirical signal is the clearest prioritisation we've had. #475 was promoted to #1 after PR #511 merged; pushed to #4 as the new issues outrank it on agent-adoption impact.

Agent-noted quote, worth capturing as the motivation for #516 specifically: "the gap between 'the type system is happy' and 'the compiled artefact actually runs' is wider than you'd expect from a language with SMT-verified contracts. The verifier can prove your termination argument is sound while the codegen silently miscompiles your closure environment out from under you."

Also posted comment on #514 documenting the third shape the agent is now isolating (closure body with i32/i64 type mismatch, distinct from direct-nesting and captured-scalar-indirection).

No code changes.

Co-Authored-By: Claude <noreply@anthropic.invalid>
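The first stage proposed for #516, categorising the trap reason instead of labelling everything "Runtime contract violation", could look roughly like the sketch below. It is illustrative only: the function name, the category labels, and the message substrings being matched are assumptions, not the Vera CLI's code or a guaranteed wasmtime message format.

```python
def categorise_trap(message: str) -> str:
    """Map a raw runtime trap message to a coarse, user-facing category."""
    text = message.lower()
    # (substring assumed to appear in the trap message, category label)
    patterns = [
        ("out of bounds memory access", "out-of-bounds memory access"),
        ("integer overflow", "integer overflow"),
        ("unreachable", "unreachable executed"),
        ("call stack exhausted", "call stack exhausted"),
    ]
    for needle, label in patterns:
        if needle in text:
            return label
    # Anything unrecognised stays explicitly unclassified rather than
    # being reported as a contract violation.
    return "unclassified runtime trap"


print(categorise_trap("wasm trap: out of bounds memory access"))
```

Keeping an explicit "unclassified" bucket avoids the mis-labelling the issue describes: an unknown trap is reported as unknown instead of being attributed to a contract.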
Fourth round of agent isolation has narrowed #514 from "nested closures" + "captured scalar through array_map" to a single root cause:

**Closures can capture primitive outer bindings (Int, Nat, Bool, Byte, Float64) but NOT heap-allocated ones (String, Array<T>, any ADT, opaque handles).**

The previously documented shapes were narrow manifestations of the same bug. vera check and vera compile both succeed for heap-capture shapes; vera run fails with "unknown table 0" or "i32/i64 type mismatch" at WASM validation.

Changes:
- SKILL.md "Capturing outer bindings" rewritten to lead with the limitation, enumerate which types work and which don't, and point at the recursion-with-explicit-parameters workaround. A new "What you cannot capture" subsection lists every heap type. A "Workaround: tail recursion with explicit parameters" subsection shows the lifted-function pattern that dodges the bug (demonstrated end-to-end in examples/life.vera once that example lands).
- KNOWN_ISSUES.md #514 row rewritten with the sharpened root cause, the full list of failing capture types, the primitive-capture allowlist, and the two observed validation error shapes.
- #514 on the issue tracker updated with the fourth-round isolation matrix and an investigation pointer at vera/wasm/closures.py._compile_lifted_closure's environment struct emission (my hypothesis: environment field layout treats all captures as i64 scalars, fine for primitives, wrong for pair-shaped or ADT values).
- scripts/check_skill_examples.py allowlist regenerated via fix_allowlists, then manually de-duplicated (two line-number collisions from the shift) and extended with 3 new FRAGMENT entries for the BROKEN-example blocks in the updated closure section. AST-verified 68 unique keys, 0 dups.

Ready to ship as the first commit of the docs-only PR (#513). More signals can land as additional commits before merge if the agent run surfaces further gaps.

Co-Authored-By: Claude <noreply@anthropic.invalid>
CodeRabbit note: reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.

Walkthrough
Documentation and tracking updates: SKILL.md expanded with array literals, closures, strings and a Known Bugs table; KNOWN_ISSUES.md adds four tracked bugs (…

Estimated code review effort: 3 (Moderate) | ~22 minutes

Pre-merge checks: 5 passed
Fourth bug surfaced during the Game of Life agent run: Vera's compiler doesn't emit WASM return_call in tail positions, so tail-recursive functions (the documented for/while replacement) blow the call stack at ~tens of thousands of frames.

Added:
- KNOWN_ISSUES.md: new row under Bugs describing the missing TCO and the SKILL.md idiom-vs-reality gap.
- ROADMAP.md: #517 inserted at position 4 in the short-term implementation-order table; #475 pushed to #5, #507 to #6.

No code changes.

Co-Authored-By: Claude <noreply@anthropic.invalid>
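The failure mode behind #517 can be illustrated by analogy in Python, which also does not eliminate tail calls. This is not Vera code and says nothing about Vera's actual frame limit; the function names are hypothetical. It only shows why a tail-recursive loop replacement needs tail-call elimination to run in constant stack space.

```python
import sys


def count_down_recursive(n: int) -> int:
    """Tail-recursive countdown: one stack frame per step without TCO."""
    if n == 0:
        return 0
    return count_down_recursive(n - 1)


def count_down_loop(n: int) -> int:
    """The same computation as a loop: constant stack depth."""
    while n > 0:
        n -= 1
    return n


def survives(fn, n: int) -> bool:
    """Report whether fn(n) completes without exhausting the call stack."""
    try:
        fn(n)
        return True
    except RecursionError:
        return False


# Pick a depth that is guaranteed to exceed the interpreter's frame limit.
depth = sys.getrecursionlimit() + 1000
print(survives(count_down_loop, depth), survives(count_down_recursive, depth))
```

A compiler that emits a real tail call (WASM return_call in the commit's terms) makes the recursive form behave like the loop, because the caller's frame is reused rather than stacked.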
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #518 +/- ##
=======================================
Coverage 91.03% 91.03%
=======================================
Files 58 58
Lines 21964 21964
Branches 259 259
=======================================
Hits 19995 19995
Misses 1962 1962
Partials 7 7
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@SKILL.md`:
- Around line 716-760: Duplicate example blocks showing the same "BROKEN"
nested-closure and "WORKING" named-helper pattern were accidentally left in the
file; remove the redundant second block so only one copy of the example remains
(keep the example that defines fill_row and build_grid and the
array_map/array_range snippets), ensuring references to fill_row, build_grid,
array_map and array_range remain intact and formatting/```vera fences are
consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f6baf12d-50f0-48f6-9b5d-0c336c53c26b
⛔ Files ignored due to path filters (2)
- docs/SKILL.md is excluded by !docs/**
- docs/llms-full.txt is excluded by !docs/**
📒 Files selected for processing (4)
- KNOWN_ISSUES.md
- ROADMAP.md
- SKILL.md
- scripts/check_skill_examples.py
Pre-commit's check_changelog_updated.py requires CHANGELOG entries whenever SKILL.md is modified. Add a Documentation + Tracked-bugs block under [Unreleased] summarising the eight SKILL sections added/rewritten and the four bugs filed (#514, #515, #516, #517) from the Game of Life agent run. Co-Authored-By: Claude <noreply@anthropic.invalid>
CodeRabbit caught two copies of the same BROKEN/WORKING nested-closure example pair — one labeled (1) and one unlabeled but otherwise identical. Dropped the unlabeled duplicate; kept the (1)-labeled copy as the canonical example for the 'Known limitation: nested closures and captured-scalar indirection' subsection. scripts/check_skill_examples.py: cleaned up ALLOWLIST (fix_allowlists re-introduced duplicate keys per the memory'd bug; manually deduped and re-verified 66 unique entries, 0 duplicates). Co-Authored-By: Claude <noreply@anthropic.invalid>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@SKILL.md`:
- Around line 2296-2307: Update the "Known Bugs and Workarounds" table in
SKILL.md to include the missing tracked issues `#515`, `#516`, and `#517`: add one
table row per issue with the Shape, a short Bug summary, the appropriate
Workaround text (copied/condensed from KNOWN_ISSUES.md), and the Issue column
linking to the GitHub issue; also update the trailing paragraph that enumerates
the likely-causing issues so it lists `#514`/`#515`/`#516`/`#517` to keep this section
consistent with KNOWN_ISSUES.md and the documented short-term bug focus.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7d829aae-2d15-437e-94d4-386aa3be0d47
⛔ Files ignored due to path filters (2)
- docs/SKILL.md is excluded by !docs/**
- docs/llms-full.txt is excluded by !docs/**
📒 Files selected for processing (2)
- SKILL.md
- scripts/check_skill_examples.py
CodeRabbit caught that the Known Bugs and Workarounds table only had rows for #514/#475/#487 but KNOWN_ISSUES.md and the ROADMAP short-term queue already tracked #515/#516/#517 as well.

Added three new rows:
- GC collect faults (#515): collector walks past heap_ptr under allocation pressure.
- Runtime trap diagnostics (#516): 'Runtime contract violation' label conflates contract-fail with OOM/overflow/etc.
- No tail-call optimization (#517): tail-recursive iteration blows the call stack at ~tens of thousands of frames.

Also rewrote the existing #514 row with the agent's sharpened characterization (closures + any heap capture, not 'nested closures'), and rewrote the trailing diagnostic-hint paragraph to map each common symptom to the specific issue it's likely caused by.

No stale labels remain; the section mirrors the KNOWN_ISSUES.md Bugs table in scope.

Co-Authored-By: Claude <noreply@anthropic.invalid>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@SKILL.md`:
- Around line 643-655: The broken repro uses invalid/placeholder names and
therefore demonstrates the wrong failure path; update the snippet so it uses a
valid array variable (replace some_array with a real variable name declared
earlier, e.g., let my_array = [10,20,30]) and correct field access on the heap
array type (use the actual variable name instead of `@Array`<Int>.1) so the
closure passed to array_fold captures the heap-typed value and triggers the
intended WASM validation/runtime error; ensure the example uses the same heap
type symbol `@Array`<Int>, references array_length(my_array) and nat_to_int(...)
correctly, and removes any invalid syntax so the repro reaches the `#514`
validation shape.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 58548c43-3afd-4f53-922c-1c21634f2651
⛔ Files ignored due to path filters (2)
- docs/SKILL.md is excluded by !docs/**
- docs/llms-full.txt is excluded by !docs/**
📒 Files selected for processing (1)
SKILL.md
Eleven open bugs total; all now in the ROADMAP implementation-order table.

Campaign-ordered by impact:
1. #514: closure heap capture
2. #515: GC collect faults
3. #516: runtime trap diagnostics
4. #517: no tail-call optimization
5. #520: Nat subtraction soundness
6. #475: 10 pre-existing WASM translator bugs
7. #487 + #348: GC allocator growth + worklist sizing (grouped)
8. #346 + #347 + #490: opaque-handle hygiene (grouped)
9. #507: Eq/Ord-dispatched array ops (enhancement, not bug)

Slots 7 and 8 are grouped-issue lines because those bugs share implementation sites and are cheaper to fix together than apart.

Also fixed SKILL.md allowlist shifts from the heap-capture example rewrite (CR round 3); allowlist now 67 unique entries, AST-verified zero duplicates.

No code changes. Pure coordination.

Co-Authored-By: Claude <noreply@anthropic.invalid>
Summary
Documentation-only PR capturing the current state of Vera after the Stage 11 capability push — and, critically, documenting the four bugs surfaced by an agent trying to write Conway's Game of Life against v0.0.119.
The agent's friction transcript is the clearest empirical signal we've had about where the language's failure modes actually bite: the language has caught up to what agents need at the capability level, but the documentation and a handful of compiler bugs are the remaining walls.
What the agent surfaced
Eight documentation gaps + four compiler bugs:
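- `[1, 2, 3]` / `[]` never documented
- `\x1b` escape not supported → `string_from_char_code(27)` workaround
- … `\n` / `\t` / `\r` / `\0` / `\\` / `\"` / `\u{XXXX}`) + explicit "not supported" list + two documented fallbacks
- `(@Unit -> T)` vs `(-> T)` ambiguity
- `apply_fn` discoverable only by inspecting `prelude.py`
- … `Array`, `String`, ADT) fails at compile time
- `$gc_collect` itself faults at memory boundary under sustained allocation pressure
- `fix_allowlists.py` silently creates duplicate dict keys (memory'd in `feedback_spec_allowlist.md`)

The agent's own self-observation is captured on #516 and worth quoting as the motivation:

"the gap between 'the type system is happy' and 'the compiled artefact actually runs' is wider than you'd expect from a language with SMT-verified contracts. The verifier can prove your termination argument is sound while the codegen silently miscompiles your closure environment out from under you."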
That's the gap this PR documents and the next few releases will close.
Structural additions to SKILL.md
Sections added, in order of appearance:
- `(-> T)` vs `(@Unit -> T)` are both valid with different call-site arities.
- `apply_fn` (same block): pins `apply_fn` as the invocation mechanism for `Fn(T -> U)` values.
- `[]` / `[1, 2, 3]`, type inference rules, context-disambiguation from the postfix `[]` index operator.
- … `\x..`, `\v`, `\f`, `\a`, `\b`), two documented workarounds (`\u{XXXX}` unicode escape or `string_from_char_code(N)`).
- … `$alloc` grows memory by only 1 page — large single allocations trap #487), each with a concrete workaround and tracking issue.

KNOWN_ISSUES.md updates
Three new rows under "Bugs":
- `$gc_collect` walks past `$heap_ptr` to the linear-memory bound and traps.

ROADMAP.md updates
Short-term queue re-sorted into a bug-killing campaign with the three freshly-surfaced bugs at the top:
Workflow meta-lesson
Two observations worth carrying forward (saved to session memory, not included in the PR):
- … `feedback_coderabbit.md` after PR #511 (Release v0.0.119: JSON typed accessors (#366)) round 3.

Validation
- `check_skill_examples.py`: 52 parsed + 67 allowlisted + 0 failed
- `check_doc_counts.py`: consistent
- `check_site_assets.py`: up-to-date
- `mypy` / `pytest` clean on the branch

Files
- `SKILL.md`: new subsections (lines 164-ish, 380-ish, 596-ish, 1680-ish, 2290-ish) and updated existing ones
- `KNOWN_ISSUES.md`: three new bug rows
- `ROADMAP.md`: implementation-order table re-sorted
- `docs/SKILL.md`, `docs/llms-full.txt`: site asset regeneration
- `scripts/check_skill_examples.py`: ALLOWLIST cleanup

No source code changes. No spec changes. No compiled-artefact changes.
Test plan
- `python scripts/check_skill_examples.py` exits 0
- `python scripts/check_doc_counts.py` exits 0
- `python scripts/check_site_assets.py` exits 0
- `pytest tests/test_readme.py tests/test_build_site.py` green
- … `string_from_char_code`)

Closes #513