Docs sweep: address compiler-review findings 2026-05-18 (5 issues filed) by aallan · Pull Request #684 · aallan/vera

aallan · 2026-05-18T14:17:57Z

Summary

Docs sweep addressing findings from a strategic-mode compiler review on 2026-05-18, plus a small inline fix for #675 (E500 fix-text wording, reported externally by @rzyns on 2026-05-15) that came up while writing the PR. All 6 review findings independently verified against HEAD (v0.0.155, commit 08c1197) before acting.

Findings handled

Finding	Verified?	Action
P0-1 — `vera test` mis-classifies refuted functions	✅ Reproduced exactly; classifier at `vera/tester.py:393` only harvests `severity == "warning"` Tier 3 codes	Restored to `KNOWN_ISSUES.md` Bugs section. Reassigned from my dup #681 to @rzyns's earlier #674 (their report came 3 days first and is better-written). Fix is a separate ~half-day PR.
P0-2 — README overclaims primitive-safety guarantees	✅ Unguarded `@Int.1 / @Int.0` verifies cleanly	This PR: softened README, added FAQ Q&A, new `spec/06.4.3` paragraph; filed #680 for the implementation work
P1-1 — diagnostic-tagging discipline gap	✅ 22/98 sites fully tagged; E140 confirmed missing `Fix:`	Filed #682; added to ROADMAP CI tooling
P1-2 — Chapter 8 conformance gap	✅ All chapters 1–7, 9–10 present; ch08 absent	Filed #679; added to ROADMAP testing gaps
P2-1 — spec EBNF / Lark drift	⚠ Partially confirmed — nominal renames real; expression-precedence "drift" is deliberate notational choice	Filed #683 scoped to nominal renames
P2-2 — Tier 2 / `invariant()` strategic call	✅ Tier 2 (#427) genuinely long-running; `invariant()` already doc-resolved via PR #650	Strategic call deferred — flagged below for direction

Bonus inline fix: #675 (E500 fix text)

@rzyns also reported #675 — the E500 fix= text named only two repair classes (strengthen requires(...) / weaken ensures(...)) and missed the most common third class: fix the implementation to satisfy the existing contract. When E500 catches a typo in the function body, the user's intended repair is almost always "fix the typo" — the previous wording implicitly biased them toward weakening the contract instead.

New wording at vera/verifier.py:1305-1322:

Resolve the mismatch between the function body and its contract: fix the implementation so it satisfies this ensures() clause, strengthen requires(...) if the counterexample is outside the intended input domain, or weaken/change ensures(...) if the postcondition overstates the intended guarantee.

Implementation-repair comes first because it's the most common case in practice; the verifier is otherwise neutral about which artifact is the source of truth. Tightened test_violation_has_fix_suggestion to pin all three repair classes (the pre-existing assertion "precondition" or "postcondition" in fix.lower() survives the rewrite but would not have caught a regression dropping "implementation"; the new assertion does).

What changed

README.md — "Contracts the compiler proves" section reworded. Previous text overclaimed; new text correctly describes static vs runtime split with forward reference to #680.
FAQ.md — new Q&A "Does the compiler prove division-by-zero, out-of-bounds indexing, etc. can't happen?" walks through the static/runtime split, references #520 as the existing precedent.
spec/06-contracts.md — new §6.4.3 "Primitive Operation Safety". Subsequent sections renumbered.
KNOWN_ISSUES.md — Bugs section restored with #674 (using @rzyns's cleaner reproducer + CI-failure-mode explanation).
ROADMAP.md — added #679 (Ch 8 conformance), #680 (auto-inject obligations), #682 (diagnostic-tagging), #683 (grammar drift) under appropriate sections.
vera/verifier.py — E500 fix= text rewritten (3-class enumeration per E500 postcondition diagnostic fix text should mention implementation repair #675).
tests/test_verifier.py — test_violation_has_fix_suggestion tightened to pin all three repair classes.
Auxiliary — allowlist line-numbers updated for shifted FAQ/spec code blocks; site assets regenerated; TESTING.md test_verifier.py line count refreshed.

Strategic call deferred

P2-2 (Tier 2 verification / invariant() clauses) needs a direction:

Ship Tier 2 + invariant() — substantial work (2–4 weeks), closes spec-vs-implementation gap
Demote both to a "Future Features" appendix — one-day mechanical move; spec describes what the compiler does today
Leave as-is — status quo

The review pushed for option 2 as the cheapest move; option 1 as a stretch goal. No action in this PR pending direction.

Validation

Test plan

Each finding verified against current code before acting (verdict table above)
Filed issues link to specific code locations and include reproducers
E500 postcondition diagnostic fix text should mention implementation repair #675 fix is a localised wording change with a tightened test pinning the new contract
No regressions in any of the existing test files

Closes #675.

🤖 Generated with Claude Code

Summary by CodeRabbit

Documentation
- Added FAQ and updated README/spec/roadmap to clarify how preconditions affect division-by-zero, indexing and runtime traps, describe trap diagnostics with “Fix:” guidance, and note planned auto-injection of primitive-operation obligations.
Known Issues
- Documented misclassification where test can report functions as verified despite verifier counterexamples; added workaround.
Tests
- Strengthened regression test to require comprehensive repair-suggestion text in diagnostics.
Changelog
- Recorded documentation and diagnostic-fix entries.

A strategic-mode review on 2026-05-18 surfaced 6 findings. All 6 independently verified against HEAD (v0.0.155, commit 08c1197). This PR addresses the docs half — code fixes land separately under the filed issues. ## What changed **README.md** — softened the "Contracts the compiler proves" section. The previous wording ("Division by zero is not a runtime error — it is a type error. The compiler checks every call site to prove the divisor is non-zero.") overclaimed: the verifier checks contracts the programmer wrote, not auto-synthesised obligations on primitive operations. New wording correctly describes what's static (declared preconditions discharged at every call site) and what's runtime (unguarded primitives trap with Vera-native diagnostic), with a forward reference to #680 which tracks the implementation work to make the original claim true. **FAQ.md** — new Q&A "Does the compiler prove division-by-zero, out-of-bounds indexing, etc. can't happen?". Explicitly walks through the static/runtime split, references the #520 Nat- subtraction precedent (the one auto-synthesised obligation that exists today), and points at #680 + #427 as the long-term path. **spec/06-contracts.md** — new §6.4.3 "Primitive Operation Safety" between Call Site Verification and SMT Solver Integration. States explicitly that obligations on `a / b`, `a % b`, `arr[i]`, `string_at(s, i)`, etc. are not auto-synthesised; the exception is `@Nat` subtraction (#520). Cross-references #680 and #427. Subsequent sections re-numbered (§6.4.4 SMT Solver Integration, §6.4.5 Counterexample Reporting — was §6.4.3 and §6.4.4). **KNOWN_ISSUES.md** — restored the Bugs section with #681 (`vera test` reports refuted functions as VERIFIED). This was confirmed by the review with a clean reproducer: `vera verify` correctly emits E500 with counterexample, but `vera test` reports the buggy function as `VERIFIED (Tier 1)` with exit code 0. The classifier at `vera/tester.py:393` only harvests `severity == "warning"` Tier 3 codes; `severity == "error"` codes flow past untouched. Workaround documented: run `vera verify` before trusting `vera test` results. Fix is half a day's work, landing separately. **ROADMAP.md** — three new tracking items added under CI tooling (#682 diagnostic-tagging discipline, #683 grammar drift), one under verification depth (#680 auto-inject primitive obligations), one under testing gaps (#679 Chapter 8 conformance programs). **Auxiliary**: - `scripts/check_faq_examples.py` and `scripts/check_spec_examples.py` allowlist line numbers updated (auto-fixed via `python scripts/fix_allowlists.py --fix`) — the Abilities snippet in FAQ shifted by ~11 lines from the new Q&A insertion; the spec example in §6.4.5 shifted from line 323 to 338 from the new §6.4.3. - Site assets regenerated (`docs/llms-full.txt`, `docs/sitemap.xml`). ## Findings deferred from this PR | Finding | Status | |---|---| | **P0-1** (`vera test` mis-classification) | #681 filed; documented in KNOWN_ISSUES.md; fix is a separate ~half-day PR | | **P1-1** (diagnostic tagging) | #682 filed; ROADMAP entry added; multi-day mechanical work scoped for a follow-up PR | | **P1-2** (Ch 8 conformance) | #679 filed; ROADMAP entry added; ~day of work | | **P2-1** (grammar drift) | #683 filed; scoped to nominal renames only (expression-precedence ladders are deliberate notational difference) | | **P2-2** (Tier 2 / `invariant()` strategic call) | Tier 2 (#427) genuinely long-running; `invariant()` doc-state already resolved by PR #650. Strategic call deferred — flagged to Alasdair for direction. | ## Validation - [x] `pytest tests/` — 3,875 passed, 14 skipped, 16 stress-deselected - [x] `python scripts/check_conformance.py` — 86/86 - [x] `python scripts/check_examples.py` — 34/34 - [x] `python scripts/check_doc_counts.py` — consistent - [x] `python scripts/check_limitations_sync.py` — consistent (30 in KNOWN_ISSUES, 18 in vera/README, 4 across spec) - [x] `python scripts/check_readme_examples.py` — pass - [x] `python scripts/check_faq_examples.py` — pass (after allowlist update) - [x] `python scripts/check_spec_examples.py` — pass (after allowlist update) - [x] `python scripts/check_skill_examples.py` — pass - [x] `python scripts/check_site_assets.py` — up-to-date Co-Authored-By: Claude <noreply@anthropic.invalid>

coderabbitai · 2026-05-18T14:18:26Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: dec58810-2b44-41ef-b122-789eb7909316

📥 Commits

Reviewing files that changed from the base of the PR and between 77445b6 and 9417e32.

📒 Files selected for processing (1)

CHANGELOG.md

📝 Walkthrough

Walkthrough

This PR clarifies verifier behaviour for primitive operations: the verifier does not auto-synthesise well‑definedness obligations (division/modulo/indexing); programmers must supply requires or refinements. Spec, FAQ, README, roadmap, scripts, KNOWN_ISSUES, tests, and E500 diagnostic text were updated; runtime traps include a labelled "Fix".

Changes

Primitive Operation Safety Clarification

Layer / File(s)	Summary
Specification of primitive operation safety `spec/06-contracts.md`	New "Primitive Operation Safety" subsection defines which well-definedness obligations are auto-synthesised (none for general primitives; special-case `@Nat` subtraction). SMT solver integration and counterexample reporting headings renumbered.
User-facing documentation (FAQ and README) `FAQ.md`, `README.md`	FAQ answers whether the compiler guarantees non‑trapping (depends on `requires`/refinements). README rephrases contract verification as per-call-site SMT checks and documents runtime trapping with a Vera diagnostic and reference to issue `#680`.
Roadmap and known issues tracking `ROADMAP.md`, `KNOWN_ISSUES.md`	ROADMAP adds CI/verification-depth work items (diagnostic-tagging hook, grammar alignment, Tier 2 Z3 proving, auto-injection of primitive obligations) and a Chapter 8 conformance gap. KNOWN_ISSUES adds an `vera test` E500 misclassification entry (issue `#674`).
Script allowlist updates `scripts/check_faq_examples.py`, `scripts/check_spec_examples.py`	ALLOWLIST entries adjusted to new code-fence start lines after content shifts.
Tests and verifier diagnostic text `tests/test_verifier.py`, `vera/verifier.py`, `TESTING.md`	E500 `_report_violation` `fix` message updated to enumerate three repair classes and tests strengthened to assert presence of "implementation", "requires", and "ensures"/"postcondition" in the suggested fix; TESTING.md line count updated.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

#680: docs and ROADMAP explicitly reference planned auto-injection of primitive-safety obligations.
#674: KNOWN_ISSUES documents vera test misclassifying E500 counterexamples (reproducer and workaround added).

Suggested labels

compiler, tests, spec, ci, docs

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarises the main purpose of the PR: a comprehensive documentation update addressing compiler-review findings from 2026-05-18, with five new tracking issues filed.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch docs-2026-05-18-compiler-review-followups

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2026-05-18T14:20:50Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.02%. Comparing base (08c1197) to head (9417e32).

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #684   +/-   ##
=======================================
  Coverage   91.02%   91.02%           
=======================================
  Files          60       60           
  Lines       23413    23413           
  Branches      259      259           
=======================================
  Hits        21312    21312           
  Misses       2094     2094           
  Partials        7        7

Flag	Coverage Δ
javascript	`57.36% <ø> (ø)`
python	`94.77% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@rzyns

External contributor @rzyns filed two careful reports on 2026-05-15 (#674 and #675). This commit addresses #675 inline with the existing docs PR (#684) and reassigns the KNOWN_ISSUES.md entry from #681 to #674 (my dupe of #674 from today's compiler-review sweep — closed as duplicate). ## #675 — E500 fix text names all three repair classes Pre-fix `fix=` text at `vera/verifier.py:1306-1308` named only two repair classes: strengthen `requires(...)` or weaken `ensures(...)`. It missed the third (often most common) class: fix the implementation to satisfy the existing contract. When E500 catches a typo in the function body (e.g. `@Int.0 + @Int.0 + 1` when the contract says `@Int.0 + @Int.0`), the user's intended repair is almost always "fix the typo" — the previous wording implicitly biased them toward weakening the contract instead. New wording lists all three repair classes neutrally: > Resolve the mismatch between the function body and its > contract: fix the implementation so it satisfies this > ensures() clause, strengthen requires(...) if the > counterexample is outside the intended input domain, or > weaken/change ensures(...) if the postcondition overstates > the intended guarantee. Implementation-repair comes first because it's the most common case in practice; the verifier is otherwise neutral about which artifact is "the source of truth". Tightened `tests/test_verifier.py::TestCounterexamples:: test_violation_has_fix_suggestion` to assert all three classes appear in the lowercased fix text. The pre-existing assertion (`"precondition" or "postcondition" in fix.lower()`) survives the rewrite but would not have caught a regression dropping "implementation"; the new assertion does. Closes #675. ## #674 / #681 reassignment #674 was filed by @rzyns on 2026-05-15 reporting that `vera test --json` reports `category: "verified"` for functions the verifier has refuted with E500. I filed the same bug as #681 today (2026-05-18) from a strategic compiler-review session without realising #674 existed. #681 is closed as a duplicate; the `KNOWN_ISSUES.md` Bugs entry now references #674 with a polished reproducer (@rzyns's `@Int + @int + 1` form is cleaner than my `@Nat * 2 + 1`) and explicit mention of the CI-consumes-JSON failure mode. The underlying bug remains open; the fix is a separate code PR (half a day's work — extend `vera/tester.py::_classify_ functions` to harvest `severity == "error"` codes too, add a `"failed"` category, set non-zero exit code). ## Validation - [x] `pytest tests/test_verifier.py::TestCounterexamples::test_violation_has_fix_suggestion -v` passes - [x] `pytest tests/` — 3,875 passed, 14 skipped, 16 stress-deselected - [x] `mypy vera/` clean - [x] `ruff check --select S vera/` clean - [x] `check_doc_counts.py`, `check_limitations_sync.py`, `check_readme_examples.py`, `check_faq_examples.py`, `check_spec_examples.py`, `check_site_assets.py` all consistent - [x] TESTING.md test_verifier.py line count: 2,094 → 2,124 Co-Authored-By: Claude <noreply@anthropic.invalid>

@rzyns

CI's check_changelog_updated.py correctly flagged that PR #684 touches vera/verifier.py and spec/06-contracts.md without a matching CHANGELOG.md entry. Adds an [Unreleased] section covering: - Fixed: #675 E500 fix-text rewording (external report from @rzyns) - Documentation: the 2026-05-18 compiler-review docs sweep (README softening, FAQ Q&A, spec section 6.4.3, KNOWN_ISSUES #674 reassignment, ROADMAP additions for #679/#680/#682/#683) Co-Authored-By: Claude <noreply@anthropic.invalid>

CodeRabbit posted three findings on rzyns's PR plus there's a CI gate (doc-counts) and a project convention (empty Bugs table) to address. Two additional minor refactors (DRY constant, public helper rename) included since they touch the same files. **CR Finding 1: _bad_vera missing encoding="utf-8"** (test portability). Pre-existing helper used 30+ times in test_cli.py since long before aallan#685, but CR caught it now because rzyns's new tests added more callers. Windows runs would fall back to cp1252 on the default text-mode encoding and corrupt UTF-8 sequences in Vera fixtures. CLAUDE.md "Cross-platform pitfalls" already flags this as a convention. One-line fix on the helper benefits every caller. **CR Finding 2: needs_pos.category != "failed" too weak** (test_tester.py). In `test_call_precondition_error_fails_caller_ not_callee`, the negative assertion would still pass if needs_pos were classified as "skipped" or any other non-failed bucket. Strengthened to `== "verified"` so the test actually pins that the callee is correctly proven, not just that it avoided the failed bucket. **CR Finding 3: failed_fns dict overwrite is order-dependent** (vera/tester.py). When a single function attracts multiple verifier errors (e.g. `ensures(false)` produces E500 AND `@Nat - @Nat` produces E502 on the same body — the `test_displayed_failed_function_does_not_double_count_verifier_ errors` test exercises exactly this shape), the loop overwrites the dict entry and the displayed `reason` flips with diagnostic iteration order. Switched to first-hit (`if name not in failed_fns`) — `setdefault`-style semantics, no judgement call on which code "should" win. **Doc counts: 7 stale numbers** (CI gate would fail). `check_doc_counts.py` flagged TESTING.md total, test_cli.py + test_tester.py + test_tester_coverage.py rows, and ROADMAP.md test count. Updated all 7 plus README.md test count for consistency: 3,905 → 3,916 total. All counts now consistent. **KNOWN_ISSUES.md empty table** (project convention). PR removed the aallan#674 row but left the table header `| Bug | Issue |` with no body. Per the convention established at aallan#673's release ("retain the Bugs section, and just have some text in it to say 'No known bugs'"), replaced the empty table with the convention line. **Nit 1: DRY** — `{"E500", "E501", "E502"}` was duplicated between vera/tester.py and vera/cli.py. Extracted as `VERIFICATION_ERROR_CODES = frozenset(...)` module-level in vera/tester.py; cli.py imports it. `frozenset` over plain set so the constant is immutable (matches its semantic role as a classifier-set constant). **Nit 2: private name reaching across modules** — `cli.py` was importing `_failed_function_name` from vera/tester.py. The single-underscore prefix signals "private by convention" but cli.py's reliance on it was a leak. Renamed `_failed_function_name` → `failed_function_name` (public); docstring updated to mention the cli.py call site explicitly. Validation: - pytest 3,886 / 3,886 (no test count change — the renames don't affect test behaviour, just the names) - mypy clean - ruff S clean - conformance 86/86, examples 34/34 - check_doc_counts.py consistent - limitations sync consistent - check_changelog_updated.py passes (existing [Unreleased] entry from rzyns's commit + the additional carry-over from PR aallan#684) - check_site_assets.py up-to-date Co-Authored-By: Claude <noreply@anthropic.invalid>

aallan mentioned this pull request May 18, 2026

vera test reports buggy_double as VERIFIED when verifier emits E500 (correctness, unsafe direction) #681

Closed

This was referenced May 18, 2026

vera test --json reports verified/ok for a postcondition failure caught by vera verify --json #674

Closed

E500 postcondition diagnostic fix text should mention implementation repair #675

Closed

aallan merged commit 3651993 into main May 18, 2026
22 checks passed

aallan deleted the docs-2026-05-18-compiler-review-followups branch May 18, 2026 14:56

aallan mentioned this pull request May 19, 2026

Release v0.0.156 #688

Merged

12 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Docs sweep: address compiler-review findings 2026-05-18 (5 issues filed)#684

Docs sweep: address compiler-review findings 2026-05-18 (5 issues filed)#684
aallan merged 3 commits into
mainfrom
docs-2026-05-18-compiler-review-followups

aallan commented May 18, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 18, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Suggested labels

Review ran into problems

Uh oh!

codecov Bot commented May 18, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

aallan commented May 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Findings handled

Bonus inline fix: #675 (E500 fix text)

What changed

Strategic call deferred

Validation

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Suggested labels

Review ran into problems

Uh oh!

codecov Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

aallan commented May 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 18, 2026 •

edited

Loading

codecov Bot commented May 18, 2026 •

edited

Loading