Skip to content

Layer 1 (+ Layer 2 status note) of #626: pre-commit gate against unexpected [E602]/[E604] silent skips#656

Merged
aallan merged 7 commits into
mainfrom
layer1-e602-gate
May 11, 2026
Merged

Layer 1 (+ Layer 2 status note) of #626: pre-commit gate against unexpected [E602]/[E604] silent skips#656
aallan merged 7 commits into
mainfrom
layer1-e602-gate

Conversation

@aallan

@aallan aallan commented May 11, 2026

Copy link
Copy Markdown
Owner

Summary

Layer 1 of #626 — convert silent translator-skip failures into loud build failures.

Adds scripts/check_e602_clean.py — a pre-commit + CI gate that fails when any compile of an example or conformance program emits [E602] (body unsupported) or [E604] (param unsupported) outside an explicit allowlist.

Note on Layer 2 status

Layer 2 of #626 ("fail-fast in closure-lift") is already implemented — it landed via #636 (closed 2026-05-09, shipped in v0.0.142). The commit-on-success pattern in vera/codegen/closures.py::_lift_pending_closures plus the parent-fn-drop path in vera/codegen/functions.py::_compile_fn together implement option (a) from #626's Layer 2 spec. No new code is needed in this PR for Layer 2 — the issue body just hadn't been updated to reflect #636's closure. Full status posted as a comment on #626.

Net effect of this PR + the existing #636 work: Layers 1 + 2 of #626 are both delivered. Layer 3 (audit 50+ return None sites in vera/wasm/ and vera/codegen/ and convert to structured exceptions) is multi-day and stays its own focused effort.

Why

The [E602] warning channel was the project's only signal for silent translator-skip failures (the cross-cutting pattern tracked by #626). Several long-standing instances of it were buried in every WASM compile — making it impossible to spot a new genuine silent skip without manually sifting through expected noise.

This came up concretely while looking at #604: the original issue framing said "five prelude combinators silently skipped", but on closer investigation only 3 of 5 actually fail at runtime — the other 2 work fine via mono and the warning is misleading. The diagnostic noise was hiding the real bug shape. See the #604 investigation comment for the refined diagnosis.

The gate makes a new silent skip a hard build failure unless explicitly allowlisted with a tracking-issue reference. Removing the noise unblocks the "convert silent failures to loud failures" goal in #626's body.

What the first run surfaced

8 silent skips in existing examples + conformance programs that were previously hidden by the prelude noise. Filed as #655:

All 11 entries (5 prelude + 6 newly-surfaced) are in the allowlist with their respective tracking issue. The allowlist is meant to shrink over time as the underlying bugs are fixed.

Files changed

Validation

  • python scripts/check_e602_clean.pyScanned 120 files, Clean: 120, Allowlisted skips suppressed: 11 known functions
  • python scripts/check_doc_counts.py — counts consistent (26 hooks)
  • python scripts/check_limitations_sync.py — 35 entries in KNOWN_ISSUES.md
  • Full pre-commit hook chain runs clean

Why this isn't a release

No version bump in this PR — Layer 1 is infrastructure (the gate + allowlist) without behaviour change to the compiler itself. The next user-visible behaviour change (likely #604's mono inference fix or #655's head codegen fix) is the natural moment to cut v0.0.145.

Pairs with

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added an automated quality gate that fails commits and CI when unexpected compiler-skip warnings ([E602]/[E604]) appear in examples and conformance tests.
  • Documentation

    • Updated contributor and testing guides to document the new pre-commit/CI check and revised hook counts.
    • Expanded known-issues and changelog with refined diagnostics and the cases covered by the new gate.
  • Chores

    • Extended local pre-commit and CI linting to run the new validation check.

Review Change Stack

…ent skips

Adds `scripts/check_e602_clean.py` — a pre-commit + CI gate that
fails when any compile of an example or conformance program emits
`[E602]` (body unsupported) or `[E604]` (param unsupported) outside
an explicit allowlist.

## Why

The `[E602]` warning channel was the project's only signal for
silent translator-skip failures (the cross-cutting pattern tracked
by #626).  Several long-standing instances of it were buried in
every WASM compile — five prelude combinators (`option_unwrap_or` /
`result_unwrap_or` / `option_map` / `option_and_then` /
`result_map`), plus six in `examples/*.vera` and
`tests/conformance/*.vera` — making it impossible to spot a new
genuine silent skip without manually sifting through expected noise.

The gate makes a new silent skip a hard build failure unless
explicitly allowlisted with a tracking-issue reference.  Removing
the noise unblocks the "convert silent failures to loud failures"
goal in the issue body.

## What the first run surfaced

8 silent skips in existing example + conformance programs that
were previously hidden by the noise.  Filed as #655:

- **Shape A** (7 functions: `option_unwrap_or`, `result_unwrap_or`,
  `identity`, `const`, `is_some`, `are_equal`, `cmp_sign`) —
  spurious warnings on generic decls; mono clones work end-to-end.
  Same root cause as #604's `_unwrap_or` half.
- **Shape B** (1 function: `head` in `examples/refinement_types.vera`)
  — genuine codegen gap.  Calling `head([1,2,3])` actually fails
  with `unknown func: $head` because the IndexExpr translator
  returns None for `@NonEmptyArray.0[0]` (refinement-of-Array
  alias).

All 11 (5 prelude + 6 newly-surfaced; counting `option_unwrap_or` /
`result_unwrap_or` once even though they appear in both groups
above conceptually) are allowlisted with their tracking issue
(#604 for the prelude, #655 for the rest).  The allowlist is
meant to shrink over time as the underlying bugs are fixed.

## Refined diagnosis of #604

During the investigation that triggered Layer 1, the original
"5 functions silently skipped" framing in #604 was refined.  Only
3 of 5 actually fail at runtime (`option_map` / `option_and_then` /
`result_map` — mono produces wrong type-arg suffix → call_indirect
signature mismatch).  The other 2 (`option_unwrap_or` /
`result_unwrap_or`) work end-to-end via mono; the warning is
spurious.  See the investigation comment on #604.

## Files changed

- **`scripts/check_e602_clean.py`** (new) — the gate.  Compiles
  each `.vera` file with `compile --wat --json` and parses the
  envelope for `[E602]` / `[E604]` warnings, filtering against
  `ALLOWED_SKIPS`.
- **`.pre-commit-config.yaml`** — wired `e602-clean` hook to fire
  on changes to `examples/*.vera`, `tests/conformance/*.vera`,
  `vera/*.py`, `vera/grammar.lark`, `vera/prelude.py`, or the
  check script itself.
- **`.github/workflows/ci.yml`** — added the check to the lint job,
  between `check_html_examples.py` and `check_site_assets.py`.
- **`KNOWN_ISSUES.md`** — refined the #604 entry to reflect the
  new diagnosis; added a row for #655.
- **`CONTRIBUTING.md`**, **`TESTING.md`** — hook count 25 → 26
  reflecting the new commit-stage hook.
- **`CHANGELOG.md`** — `[Unreleased]` entries under `### Added`.

## Validation

- `python scripts/check_e602_clean.py` — `Scanned 120 files, Clean: 120, Allowlisted skips suppressed: 11 known functions`
- `python scripts/check_doc_counts.py` — `Documentation counts are consistent (..., 26 hooks, ...)`
- `python scripts/check_limitations_sync.py` — `Limitation tracking is consistent (35 in KNOWN_ISSUES.md, ...)`

Files #655 (Layer-1 finding details).

Co-Authored-By: Claude <noreply@anthropic.invalid>
@coderabbitai

coderabbitai Bot commented May 11, 2026

Copy link
Copy Markdown

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds Layer 1 of #626: a gate script that compiles example and conformance .vera files, extracts [E602]/[E604] warnings, compares them to an ALLOWED_SKIPS mapping, and fails pre-commit/CI on unexpected skips; plus pre-commit and CI wiring and documentation updates.

Changes

E602/E604 Silent Skip Gate

Layer / File(s) Summary
Gate Script Entry & Data Structure
scripts/check_e602_clean.py
Module docstring and ALLOWED_SKIPS mapping of allowed function names to (error_code, issue_number, reason) entries.
Per-file compilation & warning extraction
scripts/check_e602_clean.py
_extract_skips() runs vera.cli compile --wat --json per .vera file (60s timeout), parses JSON output, and extracts only [E602]/[E604] warnings with recovered function names; returns sentinel failures for parse/timeouts.
Aggregation & allowlist validation
scripts/check_e602_clean.py
_scan_paths() aggregates extracted warnings, verifies each warning against ALLOWED_SKIPS (code must match), formats unexpected-failure messages, and returns counts plus failure list.
Discovery, orchestration & exit logic
scripts/check_e602_clean.py
main() discovers .vera files under examples/ and tests/conformance/ (filtered by manifest levels), validates inputs, runs the scan, prints stats and failures, and exits 0/1.
Pre-commit & CI wiring
.pre-commit-config.yaml, .github/workflows/ci.yml
Add local pre-commit hook e602-clean configured to run the script against .vera examples, conformance, and relevant vera/ source; add a CI lint job step invoking scripts/check_e602_clean.py.
Documentation & Known Issues
CHANGELOG.md, KNOWN_ISSUES.md, CONTRIBUTING.md, TESTING.md
CHANGELOG documents the new gate and allowlist; KNOWN_ISSUES refines diagnosis and records allowlisted cases; CONTRIBUTING and TESTING update hook counts and CI/hook documentation.

Sequence Diagram(s)

sequenceDiagram
  participant PreCommitOrCI as Pre-commit/CI
  participant CheckScript as check_e602_clean.py
  participant FileSystem as File System
  participant VeraCLI as vera.cli compile
  participant ExitCode as Exit Code

  PreCommitOrCI->>CheckScript: invoke
  CheckScript->>FileSystem: discover examples/*.vera + tests/conformance/*.vera
  loop per .vera file
    CheckScript->>VeraCLI: compile --wat --json (60s)
    VeraCLI-->>CheckScript: JSON output (or non-JSON / timeout)
    CheckScript->>CheckScript: parse JSON -> extract E602/E604 warnings + function names
    CheckScript->>CheckScript: validate warnings vs ALLOWED_SKIPS
  end
  alt no unexpected failures
    CheckScript->>ExitCode: exit 0 + print summary
  else unexpected failures found
    CheckScript->>ExitCode: exit 1 + print failures to stderr
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

  • #626: Directly implements the Layer‑1 pre-commit/CI gate (script + wiring).

Suggested labels

ci, docs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly references the main change: a pre-commit/CI gate (Layer 1 of #626) that prevents unexpected [E602]/[E604] silent skips, which aligns with the core objective and all substantial file modifications.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch layer1-e602-gate

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov

codecov Bot commented May 11, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.95%. Comparing base (d82eda5) to head (941b58b).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #656   +/-   ##
=======================================
  Coverage   90.95%   90.95%           
=======================================
  Files          59       59           
  Lines       23197    23197           
  Branches      259      259           
=======================================
  Hits        21098    21098           
  Misses       2092     2092           
  Partials        7        7           
Flag Coverage Δ
javascript 57.36% <ø> (ø)
python 94.73% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CONTRIBUTING.md`:
- Around line 88-90: The text contains a contradictory hook count: the first
sentence states "24 are configured at the commit stage" while the following
parenthetical lists "commit-stage hooks (23…)" — update both places to the same
correct number (choose either 24 or 23) so the numbers for "commit-stage hooks"
and the parenthetical match; make the change where the phrases "24 are
configured at the commit stage" and "The **commit-stage** hooks (23, each gated
to relevant files) include:" appear, and verify any nearby references to
"pre-commit install", "pre-commit install --hook-type pre-push",
"check-changelog-updated" and "uv-lock-check" remain unchanged.

In `@scripts/check_e602_clean.py`:
- Around line 22-26: Update the module docstring in scripts/check_e602_clean.py
to reflect the current allowlist scope by mentioning that, in addition to the
five `#604` prelude combinators, the allowlist also includes the `#655` entries;
edit the docstring text (the top-of-file module docstring) so it clearly states
both issue numbers (`#604` and `#655`) and briefly why those entries are allowed to
avoid future drift.
- Around line 128-133: The subprocess.run call that assigns to result (the
invocation of [sys.executable, "-m", "vera.cli", "compile", "--wat", "--json",
file]) needs a timeout to avoid hanging; update the subprocess.run call in
scripts/check_e602_clean.py to include a timeout argument (e.g., timeout=30 or a
configurable constant) and handle TimeoutExpired by catching
subprocess.TimeoutExpired around that call (log an error and treat as a failed
compile), so the script fails fast instead of blocking indefinitely.

In `@TESTING.md`:
- Line 447: Update the "Full list" section so it no longer claims completeness
or actually includes the missing gate hooks: either add a new table row(s) for
the E602 and E604 gate hook(s) with the same columns/format as the other hooks
(matching how check-changelog-updated and uv-lock-check are documented) or
change the heading/text to avoid saying "Full list" (e.g., "Configured hooks" or
"Representative list") so it doesn't assert completeness; ensure you reference
the hook identifiers E602 and E604 (or the exact internal hook names used in the
repo) so readers can map them to the pre-commit configuration.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a6e249a0-8db5-4d9f-a894-50e6d88b75d3

📥 Commits

Reviewing files that changed from the base of the PR and between d82eda5 and f7588c1.

📒 Files selected for processing (7)
  • .github/workflows/ci.yml
  • .pre-commit-config.yaml
  • CHANGELOG.md
  • CONTRIBUTING.md
  • KNOWN_ISSUES.md
  • TESTING.md
  • scripts/check_e602_clean.py

Comment thread CONTRIBUTING.md Outdated
Comment thread scripts/check_e602_clean.py Outdated
Comment thread scripts/check_e602_clean.py
Comment thread scripts/check_e602_clean.py Outdated
Comment thread TESTING.md
Four valid findings, all fixed:

1. **`CONTRIBUTING.md` line 90** — contradictory hook count.
   Line 88 said "24 are configured at the commit stage" but
   line 90's parenthetical said "(23, each gated to relevant
   files)".  Updated 23 → 24 for consistency.

2. **`TESTING.md` Pre-commit Hooks "Full list"** — was missing
   the new `check_e602_clean.py` row.  Added row between
   `check_html_examples.py` and `check_doc_counts.py` with the
   description and #626 link.  Also updated the CI Pipeline
   table's lint job description to include `check_e602_clean.py`
   in the script enumeration.

3. **`scripts/check_e602_clean.py` module docstring** — referenced
   only `#604` even though the allowlist now also covers `#655`
   entries.  Expanded the docstring to enumerate both groups (5
   prelude tracked by `#604`, 6 user-code cases tracked by
   `#655`) with per-group rationale so the file's scope is
   clear without diving into ``ALLOWED_SKIPS``.

4. **`scripts/check_e602_clean.py` subprocess timeout** — the
   `subprocess.run` call had no timeout, so a pathological
   compile (infinite loop, Z3 stall) could block the gate
   indefinitely.  Added `timeout=60` and a `TimeoutExpired`
   handler that surfaces the timeout as a `("TIMEOUT", file,
   reason)` failure tuple, treated as a hard failure in
   `_scan_paths` like `PARSE_ERROR`.  60s budget matches the
   longest existing per-file timeout in any check script
   (`check_html_examples.py`'s `vera verify` invocation).

Co-Authored-By: Claude <noreply@anthropic.invalid>
@aallan

aallan commented May 11, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented May 11, 2026

Copy link
Copy Markdown
✅ Actions performed

Full review triggered.

@aallan aallan changed the title Layer 1 of #626: pre-commit gate against unexpected [E602]/[E604] silent skips Layer 1 (+ Layer 2 status note) of #626: pre-commit gate against unexpected [E602]/[E604] silent skips May 11, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
scripts/check_e602_clean.py (1)

164-187: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle ok: false JSON compile envelopes as hard failures.

At Line 165 onward, JSON is parsed but envelope["ok"] is never checked. A compile can fail with valid JSON and no [E602]/[E604] warnings, and this path is currently treated as clean.

Proposed fix
     try:
         envelope = json.loads(result.stdout)
     except json.JSONDecodeError:
         # Not JSON — compile produced something else (CLI error).
         # Surface as a failure with the stderr text inline.
         return [("PARSE_ERROR", file, result.stderr.strip()[:200])]
+
+    if envelope.get("ok") is False:
+        diagnostics = (
+            envelope.get("errors")
+            or envelope.get("diagnostics")
+            or []
+        )
+        first_error = "compile failed"
+        if diagnostics and isinstance(diagnostics[0], dict):
+            first_error = (
+                diagnostics[0].get("description")
+                or diagnostics[0].get("message")
+                or first_error
+            )
+        return [("COMPILE_ERROR", "", first_error)]
 
     skips: list[tuple[str, str, str]] = []
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/check_e602_clean.py` around lines 164 - 187, The code parses the
compiler JSON into the envelope variable but never checks envelope["ok"], so
valid JSON that indicates a compile failure is treated as clean; after parsing
(where envelope is set) add a check for envelope.get("ok") being falsy and
immediately return a hard failure tuple (e.g., ("COMPILE_ERROR", file, ...))
using the envelope's error message or summary (falling back to result.stderr)
truncated to 200 chars; keep using the existing return shape and types so
callers of this function that expect a list of (code, file, message) tuples
still work.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@scripts/check_e602_clean.py`:
- Around line 164-187: The code parses the compiler JSON into the envelope
variable but never checks envelope["ok"], so valid JSON that indicates a compile
failure is treated as clean; after parsing (where envelope is set) add a check
for envelope.get("ok") being falsy and immediately return a hard failure tuple
(e.g., ("COMPILE_ERROR", file, ...)) using the envelope's error message or
summary (falling back to result.stderr) truncated to 200 chars; keep using the
existing return shape and types so callers of this function that expect a list
of (code, file, message) tuples still work.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 90d880b3-1981-41c0-9a5d-a7ea2b2ef827

📥 Commits

Reviewing files that changed from the base of the PR and between f7588c1 and 413d1ca.

📒 Files selected for processing (3)
  • CONTRIBUTING.md
  • TESTING.md
  • scripts/check_e602_clean.py

@aallan

aallan commented May 11, 2026

Copy link
Copy Markdown
Owner Author

Cross-reference: Layers 1 + 2 of #626 delivered

Three-way link for the record:

Once this PR merges, Layers 1 + 2 of #626 are both delivered. Layer 3 (audit 50+ return None sites in vera/wasm/ and vera/codegen/ and convert to structured CodegenSkip / CodegenInvariantError) is multi-day and will be filed as its own focused tracker.

Both #626 and #636 now have comments pointing back to this PR and to each other so the cross-reference chain is closed.

…k` conformance programs

CodeRabbit found a real gap: the gate parsed the compile JSON
envelope but never checked `envelope["ok"]`, so a well-formed
JSON envelope indicating a hard compile failure (parse error,
type error, unresolved symbol) was treated as clean as long as
no `[E602]` / `[E604]` warnings fired.  A `vera compile` that
fails entirely could slip through the gate.

Fixed by checking `envelope.get("ok", True)` after parse;
when False, surface a `("COMPILE_ERROR", file, msg)` failure
tuple using the first diagnostic's `[error_code]` +
description.  Falls back to `result.stderr` (truncated to 200
chars) if the envelope has no diagnostics.

`_scan_paths` now treats `COMPILE_ERROR` identically to
`PARSE_ERROR` and `TIMEOUT` — all three are hard failures
distinct from per-function `[E602]` / `[E604]` skips.

## Surfaced finding from the new check

Running the strengthened gate found one previously-hidden case:
`tests/conformance/ch03_typed_holes.vera` triggers `[E614]
Program contains a typed hole; fill all holes before compiling`
— **by design**.  The conformance program exists specifically to
demonstrate that the compiler rejects programs with unfilled
typed holes at compile time.  This is the canonical
`level: "check"` shape — a conformance program required to pass
`vera check` but NOT to compile.

## How the script now handles `level: check`

Reads `tests/conformance/manifest.json` and filters conformance
programs to those declared `level: "verify"` or `level: "run"` —
the levels where successful compilation is a requirement.  The
4 `level: "check"` programs (typed_holes plus three others) are
skipped entirely.

Net effect: the gate now scans 116 files (34 examples + 82
verify/run-level conformance, down from 86 total conformance).
A `level: "check"` program that fails compilation is by design;
a `level: "verify"` or `level: "run"` program that emits an
unexpected `[E602]` / `[E604]` (or fails compilation outright)
is a hard fail.

Co-Authored-By: Claude <noreply@anthropic.invalid>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
scripts/check_e602_clean.py (1)

134-136: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail closed on JSON compile failures (ok: false).

Line 165 parses the compile envelope but never checks ok. A file can fail compilation without emitting [E602]/[E604], yet still be counted as clean, which weakens the gate’s core guarantee.

Proposed fix
@@
     try:
         envelope = json.loads(result.stdout)
     except json.JSONDecodeError:
         # Not JSON — compile produced something else (CLI error).
         # Surface as a failure with the stderr text inline.
         return [("PARSE_ERROR", file, result.stderr.strip()[:200])]
+
+    if envelope.get("ok") is False:
+        diagnostics = (
+            envelope.get("errors")
+            or envelope.get("diagnostics")
+            or []
+        )
+        first_error = "compile failed"
+        if diagnostics and isinstance(diagnostics[0], dict):
+            first_error = diagnostics[0].get("description", first_error)
+        return [("COMPILE_ERROR", "", first_error)]
 
     skips: list[tuple[str, str, str]] = []

Also applies to: 165-187

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/check_e602_clean.py` around lines 134 - 136, The code that parses the
compile JSON envelope never checks the envelope's "ok" flag, so add a check
after the JSON is parsed (where the variable is named like envelope or
compile_envelope) to treat an explicit ok: false as a compile failure: if not
envelope.get("ok", True) then record it as a failure (e.g., increment the
failure counter or set the overall clean flag/exit code), include the envelope's
error details in the failure output/log, and ensure this branch prevents the
file from being reported as clean; update the same logic block that currently
inspects [E602]/[E604] so both diagnostic entries and ok:false are considered
failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@TESTING.md`:
- Line 488: Update the "lint" row in TESTING.md so it matches the CI workflow:
add the missing checks `check_changelog_updated.py` and `check_doc_counts.py`
alongside the existing `check_e602_clean.py` entry (keep the other scripts and
`ruff check --select S vera/` intact) to ensure the documented command list
reflects the actual `.github/workflows/ci.yml` steps.

---

Duplicate comments:
In `@scripts/check_e602_clean.py`:
- Around line 134-136: The code that parses the compile JSON envelope never
checks the envelope's "ok" flag, so add a check after the JSON is parsed (where
the variable is named like envelope or compile_envelope) to treat an explicit
ok: false as a compile failure: if not envelope.get("ok", True) then record it
as a failure (e.g., increment the failure counter or set the overall clean
flag/exit code), include the envelope's error details in the failure output/log,
and ensure this branch prevents the file from being reported as clean; update
the same logic block that currently inspects [E602]/[E604] so both diagnostic
entries and ok:false are considered failures.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: c2039a20-8622-4d99-80a0-405e7d05f844

📥 Commits

Reviewing files that changed from the base of the PR and between d82eda5 and 413d1ca.

📒 Files selected for processing (7)
  • .github/workflows/ci.yml
  • .pre-commit-config.yaml
  • CHANGELOG.md
  • CONTRIBUTING.md
  • KNOWN_ISSUES.md
  • TESTING.md
  • scripts/check_e602_clean.py

Comment thread TESTING.md Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/check_e602_clean.py`:
- Around line 261-271: Currently the script treats manifest.json as optional and
silently skips manifest-listed files that don't exist; change it to fail fast:
if manifest_path does not exist, exit with a non-zero error (or raise) reporting
the missing manifest; when iterating manifest entries (variable manifest and
entry), if any referenced path (path variable built from repo_root /
"tests/conformance" / entry["file"]) is missing, immediately raise/exit with a
clear error mentioning the missing file and the manifest entry instead of
silently dropping it; keep populating conformance only after validating all
entries so conformance.sort() runs on a fully validated list.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a7034b79-e412-4e09-ba84-41e62d15549c

📥 Commits

Reviewing files that changed from the base of the PR and between 413d1ca and 8a179fd.

📒 Files selected for processing (2)
  • TESTING.md
  • scripts/check_e602_clean.py

Comment thread scripts/check_e602_clean.py Outdated
…isted file (#626 Layer 1)

CodeRabbit found that the manifest-driven conformance filter silently
masked two state-inconsistency cases:

1. Missing `tests/conformance/manifest.json` → conformance list
   stayed empty; the gate scanned only examples.  No error surfaced.

2. Manifest entry referencing a file that doesn't exist on disk →
   silently dropped.  A partial scan set hides regressions in the
   missing program.

Both cases now fail fast with clear stderr messages:

- Missing manifest: "ERROR: conformance manifest not found at <path>"
  before any compile work.
- Missing files: collect all manifest entries with missing files
  first, then print one row per missing file with its entry id +
  filename, then refuse to run with rationale.  Collecting first
  surfaces the full discrepancy in one message rather than aborting
  on the first missing file.

Validated both paths by (a) temporarily moving the manifest aside
and (b) appending a fake entry referencing a nonexistent file —
both produce the expected error and exit 1.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/check_e602_clean.py`:
- Around line 58-124: ALLOWED_SKIPS is currently keyed only by function name
(the ALLOWED_SKIPS dict) which allows collisions across files; change the
allowlist to key by a unique source identifier plus function name (e.g. use
tuple (relative_path, fn_name) or (manifest_entry_id, fn_name) as the dict key)
and update all lookup sites that currently check by fn_name alone (the
suppression logic that references fn_name) to construct and use the same
composite key; ensure any code that builds or populates ALLOWED_SKIPS (the
literal dict and any loader) is adjusted to include the source identifier for
each entry and adapt tests/uses accordingly.
- Around line 221-223: The current branch that appends to unexpected when code
in ("PARSE_ERROR","TIMEOUT","COMPILE_ERROR") wrongly treats hard failures as
E602/E604 skips; update the logic in the handler that inspects the variable code
(around the block that appends to unexpected) to split these cases: keep
unexpected for actual skip codes (E602/E604), but create/append to a separate
hard_failures (or failures) collection for "PARSE_ERROR", "COMPILE_ERROR", and
"TIMEOUT" and ensure downstream stderr summary generation uses that collection
and emits distinct wording for hard failures versus skips; apply the same change
for the analogous block in the 330-345 region so summaries accurately reflect
failure mode.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f801b304-0618-456f-a9b8-e3020337bb17

📥 Commits

Reviewing files that changed from the base of the PR and between 8a179fd and 2304724.

📒 Files selected for processing (1)
  • scripts/check_e602_clean.py

Comment thread scripts/check_e602_clean.py
Comment thread scripts/check_e602_clean.py
CodeRabbit found that the gate's stderr summary collapsed two
distinct failure categories into a single "unexpected [E602]/[E604]
skips" message: per-function silent skips (the gate's core
purpose) and file-level hard failures (PARSE_ERROR / TIMEOUT /
COMPILE_ERROR — file couldn't be evaluated at all).  The wording
was inaccurate when hard failures were present; a user reading
the summary couldn't tell whether the gate had found a new
silent skip or whether the compile pipeline itself was broken.

Split into two categories with distinct semantics + reporting:

- `skip_failures` — unexpected per-function [E602]/[E604]
  warnings outside the allowlist.  The core silent-skip detection
  the gate exists for.
- `hard_failures` — file-level: COMPILE_ERROR, PARSE_ERROR,
  TIMEOUT.  File couldn't be evaluated at all.

`_scan_paths` now returns `(clean, skip_failures, hard_failures)`
instead of `(clean, failures)`.  `main` prints hard failures
first (file-level problems before per-function detail), each
section with its own rationale paragraph.  Exit code 1 if either
category is non-empty.

Verified both empty (clean run) and non-empty (manifest-listed
missing file from CodeRabbit #3 testing) produce the expected
output.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/check_e602_clean.py`:
- Around line 242-253: The current allowlist handling (ALLOWED_SKIPS lookup
inside the loop that appends to unexpected_skips) never records which allowlist
entries were actually used, letting stale keys silently suppress future
regressions; add a tracking set (e.g., used_allowlist_keys) and when you match
an allowlist entry (fn_name in ALLOWED_SKIPS) and the code equals the
expected_code mark that allowlist key as used, then after the scan iterate
ALLOWED_SKIPS keys and append a diagnostic for any key not seen (or otherwise
surface them) so unused/stale entries are reported; update the same logic around
the other block mentioned (the similar code at lines handling 342-351) to also
mark usage.
- Around line 297-310: The manifest loading and iteration must fail cleanly:
wrap the manifest_path.open/json.load call (where manifest is assigned) to catch
UnicodeDecodeError and json.JSONDecodeError and emit a clear error via the
existing logger and sys.exit(1); and when iterating the manifest (the for entry
in manifest loop that populates missing_files and candidate_paths), avoid
KeyError by checking 'file' exists (use entry.get("file") or "'file' in entry")
before building path = repo_root / "tests/conformance" / entry["file"], and if
an entry is malformed emit a descriptive error (including entry id if present)
and exit. Ensure you keep missing_files and candidate_paths logic but validate
manifest structure first so malformed content produces a clean logged exit
instead of an exception.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8a8df98f-3ad1-4fbc-969b-23968b2136e0

📥 Commits

Reviewing files that changed from the base of the PR and between 2304724 and 8e6f3b8.

📒 Files selected for processing (1)
  • scripts/check_e602_clean.py

Comment thread scripts/check_e602_clean.py
Comment thread scripts/check_e602_clean.py Outdated
…parsing

Two more valid CodeRabbit findings, both fixed:

## Stale-allowlist tracking

ALLOWED_SKIPS entries that never matched any warning during a
scan were silently suppressing nothing — and would silently mask
a future regression that re-introduces the same skip.  Mirrors
the pattern already in scripts/check_skill_examples.py.

Added a 'used_allowlist: set[str]' parameter to _scan_paths that
records each fn_name → allowlist match.  main collates 'set(ALLOWED_SKIPS) -
used_allowlist' and prints a 'STALE ALLOWLIST ENTRIES' section
with each stale entry's [code], tracking issue, and reason —
plus rationale paragraph telling the user to remove the entry
unless the affected program is being deliberately removed.  Exit
code 1 on any stale entries.

Verified by injecting a fake 'fake_stale_fn' entry: gate prints
the stale section + rationale + exits 1.

## Manifest-defensive parsing

The manifest_path.open + json.load chain previously assumed (a)
the file was valid UTF-8, (b) the parsed JSON was a list, (c)
every entry was a dict, (d) every dict had a 'file' key.  A
malformed manifest would crash with a Python traceback rather
than the structured stderr diagnostic the gate promises.

Three new defensive checks:

- 'except UnicodeDecodeError' on the open + 'except
  json.JSONDecodeError' on the parse, each producing a structured
  error message with file path + cause.
- 'isinstance(manifest, list)' check after parse with
  '{type}'.__name__ wording.
- Per-entry: 'isinstance(entry, dict)' check + explicit
  'entry.get("file")' validation with 'isinstance + non-empty
  string' shape check.  Errors include the entry id (or
  '<entry N>' fallback) for debugging.

Verified all three paths empirically: invalid JSON, JSON object
instead of array, and entry missing 'file' all produce clean
ERROR messages + exit 1.

Validated final clean run: 'Allowlisted skips suppressed: 11
known functions (11 matched, 0 stale)' — no behaviour change
when allowlist is current and manifest is well-formed.
@aallan aallan merged commit 637cddd into main May 11, 2026
22 checks passed
@aallan aallan deleted the layer1-e602-gate branch May 11, 2026 15:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant