Skip to content

SKILL doc fixes + ROADMAP browser-target tier from Conway's Life agent experiments#601

Merged
aallan merged 5 commits into
mainfrom
claude/skill-doc-fixes-from-life-experiment
May 7, 2026
Merged

SKILL doc fixes + ROADMAP browser-target tier from Conway's Life agent experiments#601
aallan merged 5 commits into
mainfrom
claude/skill-doc-fixes-from-life-experiment

Conversation

@aallan

@aallan aallan commented May 7, 2026

Copy link
Copy Markdown
Owner

Summary

Single PR for everything that came out of two Conway's Life agent experiments on current main — splits into a SKILL.md doc-fixes half and a ROADMAP stabilisation-tier half. Originally proposed as two separate PRs (#611, since closed); folded together because both originate from the same session, touch independent files, and the merge-coordination cost of splitting consistently exceeds the review-clarity benefit.

SKILL.md doc fixes (commits 1ad3a75, 0d0231a, 5581235)

Three friction points captured by an agent writing Conway's Life from scratch (program worked first-try — 200 generations clean, 0 corruption, 32 contracts Tier 1 — but the agent's think-aloud transcript surfaced these gaps):

  1. array_length return type clarified. The internal registration says Int, but the type checker permits Int <: Nat (verifier-enforced via Z3) so the result flows freely into Nat positions. Old SKILL comment said "returns Int (always >= 0)" which is correct in registration terms but leads agents to reach for int_to_nat unnecessarily. Updated to "returns Nat (the array length, flows to either Nat or Int positions)".
  2. array_fold closure shape made explicit. SKILL pinned the De Bruijn slot mapping (@Int.0 = element, @Int.1 = accumulator) but never spelled out the resulting signature shape fn(@Acc, @Elem -> @Acc). One agent wrote a probe to confirm the parameter order. Three-line comment added.
  3. New Tuples subsection. SKILL mentioned @Tuple<Int, String> as a type once with no construction example. An agent considered using array_fold over a tuple table of (dr, dc) offset pairs, couldn't find tuple syntax, fell back to a verbose sequential let chain. Tuples do work — Tuple(...) constructor — but it has to be discovered. Added a brief subsection in the "Composite types" area.

Plus a decreases rule prominence fix folded in for #605 (the subsection opening previously said "Required on recursive functions" which is misleading — only required for pure recursive functions).

ROADMAP browser-target tier addition (commit 757435e)

Two of the bugs surfaced by a second Conway's Life agent experiment on --target browser are concrete enough to belong in the stabilisation tier — the agent's design memo (posted on #608) pinpointed both as runtime-only fixes that close the terminal-vs-browser seam without language changes:

  • #609IO.sleep via JSPI (timing half)
  • #610 — ANSI subset interpreter (rendering half)

Together: the same life.vera source would run unchanged on both targets.

Stabilisation tier is now 5 items (was 3). #595 demoted to last because it's contingent on an upstream wasmtime-py PyPI release. Agent-integration tier renumbered 4/5/6 → 6/7/8.

Side-effect: allowlist re-anchoring (commit 1ad3a75)

scripts/check_skill_examples.py's allowlist line numbers re-anchored after the SKILL line offsets shifted; one stale redundant entry pruned (Non-exhaustive Match section had three allowlist entries but only two actual code blocks); one mis-anchored entry corrected at #1905 (a "bare @Int + @Int" allowlist entry was suppressing a parseable full-function example) — caught by CodeRabbit on review and addressed in 0d0231a.

Issues filed during this experimental session (9 new)

# Type Title Tracked in
#602 bug String interp WASM type mismatch KNOWN_ISSUES
#603 enh Browser runtime: marshalling helpers KNOWN_ISSUES
#604 bug Five prelude combinators skipped on WASM compile KNOWN_ISSUES
#605 doc decreases rule prominence KNOWN_ISSUES (partial fix folded into this PR)
#606 tooling fix_allowlists.py heuristic fragility KNOWN_ISSUES
#607 doc Int <: Nat subtyping undocumented KNOWN_ISSUES
#608 umbrella Terminal-vs-browser IO seam KNOWN_ISSUES
#609 enh JSPI-driven IO.sleep ROADMAP stabilisation tier
#610 enh ANSI subset interpreter ROADMAP stabilisation tier

What's NOT in this PR (deferred)

Test plan

  • All 56 SKILL.md vera blocks parse, 62 allowlisted, 0 failed
  • pytest tests/ -q — 3,747 passed, 14 skipped
  • check_doc_counts / check_limitations_sync / check_site_assets all clean
  • Compiler probe of Tuple(42, "hi") round-trip: type-checks ✓
  • All 9 issue references resolve to the correct existing issues

Summary by CodeRabbit

  • Documentation

    • Added tuples subsection with construction and destructuring guidance.
    • Corrected array_length return-type and clarified array_fold closure shape.
    • Appended browser-target limitations to docs and README.
  • Tooling

    • Adjusted example-checking allowlist to align with shifted documentation examples.
  • Known Issues

    • Documented invalid WASM for certain string interpolation cases and WASM-related prelude skips.
  • Roadmap Updates

    • Refined stabilisation roadmap with specific browser/runtime failure modes and priorities.

Closes #605

A second agent wrote Conway's Life cleanly on current main (works first
time, all 200 generations clean, 0 corruption signatures, all contracts
Tier 1).  Their think-aloud transcript surfaced three concrete SKILL.md
gaps:

1. array_length: returns Int per the registration but flows freely
   into Nat positions via the bidirectional Int <: Nat permitted by
   the type checker (verifier-enforced via Z3).  The old comment said
   "returns Int (always >= 0)" -- correct but agents read this and
   reach for int_to_nat unnecessarily before discovering empirically
   that they don't need it.  Updated to "returns Nat (the array
   length, flows to either Nat or Int positions)" matching user-
   visible behaviour.

2. array_fold closure shape: the SKILL pinned the De Bruijn slot
   mapping (@Int.0 = element, @Int.1 = accumulator) but never spelled
   out the resulting signature shape (fn(@acc, @elem -> @acc)).
   Agents have to derive it via the rightmost = .0 rule, and one
   wrote a probe to confirm before using array_fold for population
   counting.  Three-line comment added making the shape explicit.

3. Tuple construction: SKILL mentioned @tuple<Int, String> as a type
   exactly once (line 388) with no construction or destructuring
   example.  An agent considered using array_fold over a tuple table
   of (dr, dc) offset pairs for neighbour counting, hunted for tuple
   literal syntax, didn't find any, and fell back to a verbose
   sequential let chain.  Tuples DO work -- the constructor is just
   the upper-case Tuple name -- but it has to be discovered by guess
   or by reading the spec grammar.  Added a brief subsection in the
   "Composite types" area showing Tuple(...) construction and match
   destructuring.

Plus a side-effect: scripts/check_skill_examples.py allowlist line
numbers re-anchored to match the new SKILL.md line offsets.  Both
the array_fold edit and the Tuple subsection bumped subsequent line
numbers; fix_allowlists.py auto-shifted most entries but missed two
which were patched manually.

Validation:
- All 56 SKILL.md vera blocks parse, 62 allowlisted, 0 failed
- pytest tests/ -q: 3,747 passed, 14 skipped
- check_doc_counts / check_limitations_sync / check_site_assets clean

Co-Authored-By: Claude <noreply@anthropic.invalid>
@codecov

codecov Bot commented May 7, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.92%. Comparing base (20f3ccb) to head (f5afc0f).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #601   +/-   ##
=======================================
  Coverage   90.92%   90.92%           
=======================================
  Files          59       59           
  Lines       22997    22997           
  Branches      259      259           
=======================================
  Hits        20910    20910           
  Misses       2080     2080           
  Partials        7        7           
Flag Coverage Δ
javascript 57.36% <ø> (ø)
python 94.74% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@coderabbitai

coderabbitai Bot commented May 7, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 75907856-0443-4ae1-b070-ac631ac09a36

📥 Commits

Reviewing files that changed from the base of the PR and between 757435e and f5afc0f.

⛔ Files ignored due to path filters (2)
  • docs/SKILL.md is excluded by !docs/**
  • docs/llms-full.txt is excluded by !docs/**
📒 Files selected for processing (3)
  • SKILL.md
  • spec/12-runtime.md
  • vera/README.md

📝 Walkthrough

Walkthrough

SKILL.md adds a Tuples subsection and changes array_length(@array<Int>.0) return type to Nat. scripts/check_skill_examples.py ALLOWLIST is re-anchored and expanded for intentionally-unparseable Vera blocks. KNOWN_ISSUES, ROADMAP, CHANGELOG, spec, and vera README are updated accordingly.

Changes

Type Documentation and Validation Synchronisation

Layer / File(s) Summary
Type documentation
SKILL.md
Adds "Tuples" section explaining construction via Tuple(...), pattern matching with the same shape, absence of literal syntax, and Tuple<> == Unit. Updates array_length(@array<Int>.0) return type from Int to Nat.
Validation allowlist
scripts/check_skill_examples.py
ALLOWLIST mapping extended with FRAGMENT entries (inserting at line 419 and expanding coverage across later SKILL.md sections) to mark intentionally-unparseable Vera code blocks in type-alias, option/result, built-ins, markdown/HTML/regex, numeric, contracts, effects, strings/escape, common-mistakes, and import-syntax examples.
Known issues
KNOWN_ISSUES.md
Adds two new bug entries and four limitation entries plus a test-coverage note describing allowlist anchoring fragility and recommending content-fingerprint anchoring.
Release notes
CHANGELOG.md
Updates [Unreleased] with Documentation and Tooling bullets describing the SKILL.md fixes and allowlist re-anchoring in the validation script.
Roadmap
ROADMAP.md
Expands the runtime-stabilisation narrative and inserts new browser/macos stabilisation implementation items.
Runtime spec
spec/12-runtime.md
Documents browser runtime limitations for IO.sleep polling/freezes and missing ANSI escape interpretation in IO.print with suggested fix approaches.
Readme
vera/README.md
Adds a short "Current Limitations" entry noting browser-target timing and ANSI rendering issues for terminal-style programs.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Possibly related PRs

Suggested labels

compiler, spec, ci, docs

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarises the two primary changes: SKILL documentation fixes and the ROADMAP browser-target tier additions from Conway's Life experiments.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/skill-doc-fixes-from-life-experiment

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/check_skill_examples.py`:
- Around line 105-109: Remove the mis-anchored allowlist tuple that maps a
SKILL.md block to ("FRAGMENT", "Wrong: bare `@Int` + `@Int` without indices") — that
entry is suppressing a full-function example (“add effects”); either delete that
tuple or re-anchor it to the actual fragment location (or replace it with the
correct tuple for the full-function example). You can find this entry adjacent
to the other allowlist entries with messages "Wrong: missing contracts" and
"Correct: bare `@Int` + `@Int` (common mistakes)"; update the allowlist so the
fragment-only allowlist no longer points at the full-function example.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4c9b4a82-6305-437f-8fd7-fc8463045d5c

📥 Commits

Reviewing files that changed from the base of the PR and between 20f3ccb and 1ad3a75.

⛔ Files ignored due to path filters (2)
  • docs/SKILL.md is excluded by !docs/**
  • docs/llms-full.txt is excluded by !docs/**
📒 Files selected for processing (2)
  • SKILL.md
  • scripts/check_skill_examples.py

Comment thread scripts/check_skill_examples.py
CI fix:
- Added a [Unreleased] CHANGELOG entry covering the three SKILL fixes
  + the allowlist re-anchoring tooling change.  Previous push failed
  the lint job because the changelog hook was unsatisfied.

CodeRabbit fix (verified against current code):
- The allowlist tuple at line 1905 was suppressing a parseable
  example (CORRECT add-with-effects full function) -- removed.  The
  description "Wrong: bare @int + @int without indices" that lived
  there was actually the correct description for the bare-add code
  at line 1942 (where the entry's previous label "Common mistake
  example, bare if/else" was just wrong -- the block is bare add,
  not if/else).  Net change: drop the misplaced entry, fix the
  description on its rightful neighbour.

Bonus low-hanging fix folded in for #605:
- SKILL.md "decreases (termination)" subsection opening rewritten
  to lead with the pure-vs-effectful rule prominently rather than
  saying "Required on recursive functions" (which was misleading
  for effectful recursion).  Two recent agents writing Conway's
  Life had to triangulate this rule between FizzBuzz and the State
  example; the explicit one-liner lets future agents answer it on
  first read.  #605 stays open for the broader doc-prominence work
  (this is the ~5-line subset that was cheap to fold in).

Validation:
- 56 SKILL.md vera blocks parse, 62 allowlisted, 0 failed
- pytest tests/ -q: green
- check_doc_counts / check_limitations_sync / check_site_assets clean

Co-Authored-By: Claude <noreply@anthropic.invalid>
…terp)

Two concrete fix proposals from the second Conway's Life agent's
design memo, closing the timing and rendering halves of the terminal-
vs-browser IO seam without language changes.  Updated the #608 row to
point at both sub-issues so anyone reading KNOWN_ISSUES sees the full
fix path rather than just the framing.

Co-Authored-By: Claude <noreply@anthropic.invalid>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@KNOWN_ISSUES.md`:
- Line 46: Update the KNOWN_ISSUES.md wording so it no longer calls array_length
“formally `@Int`”; change the phrase to match the updated SKILL contract (e.g.,
say array_length “returns `Nat`” or is “annotated `@Nat`”) and remove any
wording that implies it is an `@Int`; ensure the entry still notes the
bidirectional `Int <: Nat` subtyping behavior and verifier-enforced refinement
while aligning the array_length description with the new SKILL phrasing.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 1641f710-c02b-4f7f-a03d-1fd271ac0175

📥 Commits

Reviewing files that changed from the base of the PR and between 0d0231a and 5581235.

📒 Files selected for processing (1)
  • KNOWN_ISSUES.md

Comment thread KNOWN_ISSUES.md
Two Conway's Life agent experiments on --target browser surfaced
five concrete blockers (issues #602/#603/#604/#609/#610) and an
explicit design memo (#608) mapping each obstacle to a concrete
runtime-only fix.  The current stabilisation framing covers
codegen reliability and walker completeness but treats browser-
target reliability as a separate concern.  The agent's diagnosis
makes that split harder to defend: "write once, run anywhere"
is currently true for pure computation and approximate-to-false
for anything with timing or screen output.  Two of the fixes
(#609 JSPI-driven IO.sleep, #610 ANSI subset interpreter) close
the timing and rendering halves of the seam without language
changes -- adding them to the stabilisation tier commits to
"browser-target Vera is something you'd actually use" before
the agent-integration push.

Changes:
- Expanded the campaign-pattern list from two patterns to three;
  third pattern documents the browser-target seam with links to
  the umbrella issue (#608) and the five concrete blockers.
- Added #609 (JSPI-driven IO.sleep) as item 3 in the stabilisation
  tier, with rationale + the WebAssembly JSPI / Asyncify mechanism.
- Added #610 (ANSI subset interpreter) as item 4 in the same tier,
  paired with #609 -- together they close the seam and let life.vera
  (terminal version) run unchanged on --target browser.
- Renumbered #595 from item 3 to item 5 (now last in stabilisation,
  since it's contingent on an upstream wasmtime-py release).
- Renumbered the agent-integration tier items 4/5/6 -> 6/7/8 to
  match.
- Updated the "What moves when" gate from "#4 starts when #1-#3
  are closed" to "#6 starts when #1-#5 are closed".

Co-Authored-By: Claude <noreply@anthropic.invalid>
@aallan aallan changed the title SKILL.md: three doc fixes from Conway's Life agent experiment SKILL doc fixes + ROADMAP browser-target tier from Conway's Life agent experiments May 7, 2026
Audit pass requested before merging #601: walk every limitation list
across SKILL.md, vera/README.md, spec/, and KNOWN_ISSUES.md to (a)
remove anything we've closed off and (b) add anything we've newly
flagged.

Result of audit:
- No closed issues found stale-listed in any open-limitation table.
  All references to closed issues (#588, #593, #520, #589) are
  referential prose ('X turned out to be...', 'see #520's pattern')
  rather than open-issue listings.  Nothing to remove.
- Five limitation entries were missing across three docs:
  * SKILL.md Known Limitations: browser-target IO seam (#608).
  * SKILL.md Known Bugs and Workarounds: #602 (string interp invalid
    WASM with workaround) and #604 (prelude combinators skipped on
    browser target).
  * vera/README.md Current Limitations: browser-target IO seam.
  * spec/12-runtime.md "## 12.8 Limitations" replaced 'No open
    limitations.' with the two browser-runtime gaps (#609 JSPI sleep,
    #610 ANSI subset interpreter).
- spec/11-compilation.md left as 'No open limitations' deliberately;
  #602 and #604 are bugs that belong in KNOWN_ISSUES, not enduring
  compilation-model limitations of the spec.

Validation: SKILL examples parse (56 OK / 62 allowlisted / 0 failed),
doc counts consistent, limitations sync clean (36 in KNOWN_ISSUES,
18 in vera/README, 4 across spec chapters).

Co-Authored-By: Claude <noreply@anthropic.invalid>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Doc: 'decreases' rule for effectful recursive functions is buried in SKILL.md

1 participant