Skip to content

Fix vera test verifier error handling (#674)#685

Merged
aallan merged 3 commits into
aallan:mainfrom
rzyns:fix-test-json-verifier-errors-674
May 19, 2026
Merged

Fix vera test verifier error handling (#674)#685
aallan merged 3 commits into
aallan:mainfrom
rzyns:fix-test-json-verifier-errors-674

Conversation

@rzyns

@rzyns rzyns commented May 18, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #674.

vera test now fails closed when the verifier reports E500/E501/E502 diagnostics instead of letting verifier-refuted functions fall through as "verified" / "Tier 1 (proved)".

This makes vera test --json safe as a green contract gate again for the verifier-error cases discussed in #674.

Behavior changes

  • Preserve verifier error diagnostics in TestResult.diagnostics.
  • Add/handle a "failed" function category for verifier-refuted public functions.
  • Surface TestSummary.failed in JSON and human output.
  • Return non-zero from vera test / vera test --json when result diagnostics contain errors.
  • Set JSON "ok": false when verifier errors are present.
  • Attribute:
    • E500 postcondition failures to the function named in in function '...'
    • E501 call-site precondition failures to the caller, not the callee
    • E502 @Nat subtraction underflow to the function named in in '...'
  • Keep --fn rows filtered to the selected function while still failing closed on whole-file verifier errors.
  • Surface private/non-displayed verifier failures in diagnostics and human summaries instead of silently dropping them.
  • Keep Tier 3 E700 runtime contract failures distinct from static verifier errors in human summaries.
  • Avoid double-counting multiple verifier diagnostics that are already represented by a displayed failed function.

Tests added / updated

Regression coverage now includes:

  • vera test --json returns nonzero and ok: false for E500 verifier errors.
  • Human output shows verifier-refuted functions as FAILED.
  • Private verifier failures are reported even when no private function row is displayed.
  • --fn keeps function rows filtered but still fails closed on unselected whole-file verifier errors.
  • E501 is attributed to the caller rather than the callee.
  • E502 @Nat underflow is classified as a failed verifier error.
  • Static verifier failures are summarized separately from Tier 3 runtime trial failures.
  • Tier 3 E700 runtime failures are not summarized as verifier errors.
  • Multiple verifier diagnostics on one displayed failed function are not double-counted as unlisted verifier errors.
  • CLI dispatch for vera test --json returns nonzero for verifier errors.

Verification

Ran locally:

pytest tests/test_tester.py tests/test_tester_coverage.py tests/test_cli.py -q
276 passed
pytest tests/ -q
3886 passed, 14 skipped, 16 deselected
mypy vera/
Success: no issues found in 59 source files
python scripts/check_conformance.py
All 86 conformance programs pass.
python scripts/check_examples.py
All 34 examples pass (check + verify).
pytest tests/test_conformance.py -q
416 passed, 14 skipped

Also checked:

git diff --check

and a staged added-line secret scan; both were clean.

Notes

This follows the intended direction from the maintainer comment on #674:

  • harvest E500/E501/E502 verifier errors,
  • introduce "failed",
  • make failed status take precedence over verified/Tier 3 classification,
  • surface TestSummary.failed,
  • return non-zero,
  • and add regression coverage.

I kept the displayed --fn function list scoped to the selected function, but made diagnostics/exit status reflect whole-file verifier errors so --fn cannot false-green a file with verifier-refuted contracts.

Summary by CodeRabbit

  • Bug Fixes

    • Verifier failures (E500, E501, E502) now cause tests to fail with non‑zero exit codes, JSON ok:false and preserved diagnostics, and human output shows failed functions plus a Diagnostics section; --fn filtering still surfaces whole‑file verifier errors.
  • New Features

    • Improved error attribution and separate counts for static verifier failures vs runtime (Tier‑3) failures; avoids double‑counting multiple diagnostics; E700 remains distinct.
  • Tests

    • Expanded regression and integration coverage for JSON/human output, filtering, attribution and exit codes.
  • Documentation

    • CHANGELOG, KNOWN_ISSUES, README, ROADMAP and TESTING updated.

Review Change Stack

@rzyns rzyns requested a review from aallan as a code owner May 18, 2026 17:48
@coderabbitai

coderabbitai Bot commented May 18, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0481a236-87ae-45e8-96b5-1bf21e873f0c

📥 Commits

Reviewing files that changed from the base of the PR and between 8d12298 and a7b0c61.

📒 Files selected for processing (6)
  • README.md
  • ROADMAP.md
  • TESTING.md
  • tests/test_cli.py
  • vera/cli.py
  • vera/tester.py

📝 Walkthrough

Walkthrough

Verifier diagnostics E500/E501/E502 are treated as errors: the test engine marks implicated functions "failed", preserves diagnostics, JSON "ok" and process exit codes fail, CLI prints a Diagnostics section, and summaries separate static verifier failures from Tier 3 runtime failures.

Changes

Verifier error classification and CLI wiring

Layer / File(s) Summary
Result type contract updates
vera/tester.py
FunctionTestResult.category and TestSummary.failed now include verifier-refuted "failed"; _VERIFICATION_ERROR_CODES added and TestSummary.unlisted_errors introduced.
Verifier error collection in test engine
vera/tester.py
_TestEngine.run() harvests verifier diagnostics (severity == "error"), initialises top-level diagnostics from them, and records "failed" FunctionTestResult entries with zero trials.
Verifier-refuted function classification
vera/tester.py
_classify_functions() scans diagnostics for E500/E501/E502, extracts implicated function names via _failed_function_name(), and classifies those functions as "failed" with verification error (...) before other classification.
Unlisted verifier-errors accounting
vera/tester.py
Computes TestSummary.unlisted_errors for verifier diagnostics not attributed to any displayed "failed" result.
CLI JSON and exit-code behaviour for verifier errors
vera/cli.py
cmd_test computes has_errors from result.diagnostics, requires s.failed == 0 AND not has_errors for JSON ok, adds unlisted_errors to JSON summary, and fails JSON and non-JSON exit codes when diagnostic errors exist.
CLI human-readable output and summary logic
vera/cli.py
Human output prints per-function f.reason for failed rows, a Diagnostics: block listing error codes and first description lines, and recomputes Results summary using static_failed, tested_failed, and unlisted_errors.
Unit and integration tests for verifier-fail semantics
tests/*
Adds Tier1/Tier3 unit tests and CLI integration tests asserting E500/E501/E502 cause "failed", preserve diagnostics (including private functions), enforce --fn filtering semantics, separate static vs Tier3 counts (E700 exclusion), avoid double-counting, and verify non-zero JSON subprocess exit.
Existing test updates and changelog/docs/metrics
tests/test_tester_coverage.py, CHANGELOG.md, KNOWN_ISSUES.md, README.md, TESTING.md, ROADMAP.md
Refinement-type alias test updated to expect E500 failure; CHANGELOG documents fail-closed behaviour and tests; KNOWN_ISSUES entry for #674 removed; README/ROADMAP/TESTING metrics updated.

Sequence Diagram

sequenceDiagram
  participant Verifier
  participant _TestEngine
  participant CLI
  Verifier-->>_TestEngine: emit diagnostics (E500/E501/E502)
  _TestEngine->>_TestEngine: filter severity=="error", extract function names, mark impacted functions "failed"
  _TestEngine-->>CLI: result (results, summary, diagnostics)
  CLI->>CLI: compute has_errors, s.failed, include unlisted_errors, determine exit code
  CLI-->>User: JSON/human output with Diagnostics block and non-zero exit where applicable
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

  • #681: Implements the same verifier-error classification and CLI/test updates for E500/E501/E502 handling.

Suggested labels

compiler, tests, docs

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly references issue #674 and accurately describes the main change: fixing vera test to properly handle verifier errors (E500/E501/E502) by failing closed.
Linked Issues check ✅ Passed The PR fully addresses #674's objectives: verifier diagnostics are preserved, functions with E500/E501/E502 errors are classified as failed, JSON ok and exit code reflect errors, and attribution rules are implemented.
Out of Scope Changes check ✅ Passed All changes are scoped to verifier error handling as specified in #674. Test count updates in documentation reflect only the newly added regression tests for this feature.
Docstring Coverage ✅ Passed Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_cli.py`:
- Around line 1625-1643: In test_dispatch_test_json_verifier_error_nonzero
update the call to prog.write_text(...) to explicitly pass encoding="utf-8" so
the test file is written with UTF-8 encoding; locate the prog.write_text
invocation in that test (the block creating "wrong_double.vera") and add the
encoding parameter to the write_text call.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9f3cc335-936f-4a9c-b6db-70dd6b95dcc6

📥 Commits

Reviewing files that changed from the base of the PR and between cdcaedc and ce5214f.

📒 Files selected for processing (6)
  • CHANGELOG.md
  • tests/test_cli.py
  • tests/test_tester.py
  • tests/test_tester_coverage.py
  • vera/cli.py
  • vera/tester.py

Comment thread tests/test_cli.py Outdated
Fail closed when `vera test` sees E500/E501/E502 verifier diagnostics instead of reporting verifier-refuted contracts as successful Tier 1 results.

Preserve verifier diagnostics in JSON, return ok=false with a nonzero exit, and show human diagnostics so private, unselected, or otherwise non-displayed verifier failures cannot disappear.

Attribute E501 call-site precondition failures to the caller while preserving E500/E502 attribution, keep --fn rows filtered while whole-file verifier errors still fail closed, and distinguish static verifier failures from Tier 3 E700 runtime failures in human summaries.

Add regression coverage for JSON and human output, private helper failures, --fn fail-closed diagnostics, E501 caller attribution, E502 underflow attribution, mixed static/Tier 3 summaries, E700 runtime failures, multiple diagnostics on one displayed failed function, and CLI dispatch return codes.
@rzyns rzyns force-pushed the fix-test-json-verifier-errors-674 branch from ce5214f to c7e41bf Compare May 18, 2026 20:40

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_cli.py`:
- Around line 1383-1399: The test creates .vera fixtures via the helper
_bad_vera which ultimately calls Path.write_text() without an explicit encoding,
making Windows runs brittle; update the fixture creation to use UTF-8 by either
modifying the helper _bad_vera to call write_text(..., encoding="utf-8") or
change the new test invocation to write the fixture with encoding="utf-8" when
creating the Path, ensuring the use of _bad_vera and write_text is the targeted
change.

In `@tests/test_tester.py`:
- Around line 90-93: The test currently asserts that needs_pos.category !=
"failed", which is too weak; change the assertion to require the callee remains
explicitly "verified" by replacing the line using _fn_result(result,
"needs_pos") to assert needs_pos.category == "verified" (keep the existing
assertion that caller.category == "failed"); update the assertion near the uses
of needs_pos, caller and _fn_result in tests/test_tester.py so the test proves
the regression only affects the caller.

In `@vera/tester.py`:
- Around line 428-431: The current assignment in the diagnostics loop overwrites
prior verifier codes for the same function (variable failed_fns) making the
visible reason order-dependent; change the logic in the block that uses
_failed_function_name(diag) so it does not clobber an existing entry for the
same name — either preserve the first hit (only set failed_fns[name] if name not
already present) or apply a deterministic priority (compare diag.error_code
against the existing failed_fns[name] using a defined priority/order such as
numeric lowest or an explicit list like ["E500","E501","E502"] and keep the
higher-priority code); update the assignment path that references diag.severity,
diag.error_code, and verification_error_codes so JSON and human outputs remain
stable and deterministic.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f47f2226-5c59-4a56-a8bb-1a4d77f67d6f

📥 Commits

Reviewing files that changed from the base of the PR and between ce5214f and c7e41bf.

📒 Files selected for processing (7)
  • CHANGELOG.md
  • KNOWN_ISSUES.md
  • tests/test_cli.py
  • tests/test_tester.py
  • tests/test_tester_coverage.py
  • vera/cli.py
  • vera/tester.py
💤 Files with no reviewable changes (1)
  • KNOWN_ISSUES.md

Comment thread tests/test_cli.py
Comment thread tests/test_tester.py
Comment thread vera/tester.py Outdated
@codecov

codecov Bot commented May 19, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 98.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 91.07%. Comparing base (cdcaedc) to head (a7b0c61).

Files with missing lines Patch % Lines
vera/tester.py 96.87% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #685      +/-   ##
==========================================
+ Coverage   91.02%   91.07%   +0.04%     
==========================================
  Files          60       60              
  Lines       23413    23458      +45     
  Branches      259      259              
==========================================
+ Hits        21312    21364      +52     
+ Misses       2094     2087       -7     
  Partials        7        7              
Flag Coverage Δ
javascript 57.36% <ø> (ø)
python 94.82% <98.00%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

CodeRabbit posted three findings on rzyns's PR plus there's a CI
gate (doc-counts) and a project convention (empty Bugs table) to
address.  Two additional minor refactors (DRY constant, public
helper rename) included since they touch the same files.

**CR Finding 1: _bad_vera missing encoding="utf-8"** (test
portability).  Pre-existing helper used 30+ times in test_cli.py
since long before aallan#685, but CR caught it now because rzyns's new
tests added more callers.  Windows runs would fall back to cp1252
on the default text-mode encoding and corrupt UTF-8 sequences in
Vera fixtures.  CLAUDE.md "Cross-platform pitfalls" already flags
this as a convention.  One-line fix on the helper benefits every
caller.

**CR Finding 2: needs_pos.category != "failed" too weak**
(test_tester.py).  In `test_call_precondition_error_fails_caller_
not_callee`, the negative assertion would still pass if needs_pos
were classified as "skipped" or any other non-failed bucket.
Strengthened to `== "verified"` so the test actually pins that the
callee is correctly proven, not just that it avoided the failed
bucket.

**CR Finding 3: failed_fns dict overwrite is order-dependent**
(vera/tester.py).  When a single function attracts multiple
verifier errors (e.g. `ensures(false)` produces E500 AND
`@Nat - @Nat` produces E502 on the same body — the
`test_displayed_failed_function_does_not_double_count_verifier_
errors` test exercises exactly this shape), the loop overwrites
the dict entry and the displayed `reason` flips with diagnostic
iteration order.  Switched to first-hit (`if name not in
failed_fns`) — `setdefault`-style semantics, no judgement call on
which code "should" win.

**Doc counts: 7 stale numbers** (CI gate would fail).
`check_doc_counts.py` flagged TESTING.md total, test_cli.py +
test_tester.py + test_tester_coverage.py rows, and ROADMAP.md
test count.  Updated all 7 plus README.md test count for
consistency: 3,905 → 3,916 total.  All counts now consistent.

**KNOWN_ISSUES.md empty table** (project convention).  PR
removed the aallan#674 row but left the table header `| Bug | Issue |`
with no body.  Per the convention established at aallan#673's release
("retain the Bugs section, and just have some text in it to say
'No known bugs'"), replaced the empty table with the convention
line.

**Nit 1: DRY** — `{"E500", "E501", "E502"}` was duplicated
between vera/tester.py and vera/cli.py.  Extracted as
`VERIFICATION_ERROR_CODES = frozenset(...)` module-level in
vera/tester.py; cli.py imports it.  `frozenset` over plain set so
the constant is immutable (matches its semantic role as a
classifier-set constant).

**Nit 2: private name reaching across modules** — `cli.py` was
importing `_failed_function_name` from vera/tester.py.  The
single-underscore prefix signals "private by convention" but
cli.py's reliance on it was a leak.  Renamed
`_failed_function_name` → `failed_function_name` (public);
docstring updated to mention the cli.py call site explicitly.

Validation:
- pytest 3,886 / 3,886 (no test count change — the renames don't
  affect test behaviour, just the names)
- mypy clean
- ruff S clean
- conformance 86/86, examples 34/34
- check_doc_counts.py consistent
- limitations sync consistent
- check_changelog_updated.py passes (existing [Unreleased] entry
  from rzyns's commit + the additional carry-over from PR aallan#684)
- check_site_assets.py up-to-date

Co-Authored-By: Claude <noreply@anthropic.invalid>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_cli.py`:
- Around line 1479-1508: The test
test_fn_filter_surfaces_unselected_verifier_error currently only exercises the
JSON output path; add a sibling assertion that exercises the human/text path by
calling cmd_test(str(path), fn_name="identity") (without as_json) and assert it
returns exit code 1, then use capsys.readouterr().out to assert the verifier
error is present in the textual diagnostics, that only the selected function
"identity" is listed, and that its category is reported as "verified" (mirror
the JSON checks but against the text output) to ensure the fail-closed behavior
on the text branch is covered.

In `@vera/tester.py`:
- Around line 90-95: VERIFICATION_ERROR_CODES and failed_function_name() are
being exported for CLI-only bookkeeping; make them private (rename to
_VERIFICATION_ERROR_CODES and _failed_function_name) or move their logic into a
single shared helper used only by the CLI, and instead expose structured
attribution/deduplication data on the TestResult/TestSummary returned by test()
(e.g., attach a list of verifier error codes and failing-function attribution to
TestResult/TestSummary) so the CLI parses structured fields rather than relying
on regex helpers; update test(), TestResult/TestSummary, and any callers
(including the CLI consumer) to consume the new structured fields and remove the
public exports.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2136ebdf-5816-4f18-b1ae-62b3d7b2fec2

📥 Commits

Reviewing files that changed from the base of the PR and between c7e41bf and 8d12298.

📒 Files selected for processing (8)
  • KNOWN_ISSUES.md
  • README.md
  • ROADMAP.md
  • TESTING.md
  • tests/test_cli.py
  • tests/test_tester.py
  • vera/cli.py
  • vera/tester.py

Comment thread tests/test_cli.py
Comment thread vera/tester.py
aallan
aallan previously approved these changes May 19, 2026

@aallan aallan left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rzyns — this is a fine fix. Approving once CI clears.

A few notes on what's here, for anyone looking at the PR:

Correctness. The classifier change in vera/tester.py is right: verifier-refuted contracts now take precedence over the verified/tier3/skipped buckets, so a function the verifier can refute statically can't fall through to a green-test result. The E501 attribution to the caller (not the callee) is the right call — the missing-precondition obligation lives at the caller, and that's where the user needs to add the requires(...) to make the verifier accept the call. The --fn filter logic is also nicely handled: filtered output stays scoped to the selected function, but whole-file verifier errors still fail closed (otherwise vera test --fn safe_fn buggy_file.vera would false-green).

Tests. Eleven new regression tests across test_tester.py and test_cli.py, each pinning a distinct property — JSON output, human output, private function visibility, --fn filter behaviour, summary split, Tier-3-vs-static distinction, double-count avoidance, dispatcher contract. The double-count test (function attracting both E500 and E502 on the same body via ensures(false) + @Nat - @Nat) was an edge case I'd have missed.

Cleanup commit (8d12298). I pushed a follow-up handling the three CR comments that arrived later — _bad_vera encoding (now encoding="utf-8" — benefits 30+ callers, not just the new tests), the needs_pos.category != "failed"== "verified" assertion strengthening (the negative form would still pass under "skipped"), and the failed_fns dict-clobber fix (first-hit semantics, so multi-diagnostic functions get a deterministic reason instead of one that flips with iteration order). Also rolled in a couple of nits: extracted VERIFICATION_ERROR_CODES as a module-level frozenset constant (was duplicated between tester.py and cli.py), and dropped the underscore from failed_function_name since cli.py imports it across modules. Plus the doc-counts CI gate (7 stale numbers in TESTING.md / ROADMAP.md / README.md after the test count went 3,905 → 3,916) and the | Bug | Issue | empty-table convention in KNOWN_ISSUES.md.

Thanks for the well-shaped contributions across #674, #675, and this one.

CR's two new comments after my cleanup commit:

**Finding 1: missing text-mode sibling for the --fn filter test**
(test_cli.py).  `test_fn_filter_surfaces_unselected_verifier_
error` covered the JSON path but not the human/text path — both
codepaths can drift independently (text rendering has its own
diagnostics section, summary line, unlisted-error counter), so
the JSON-only coverage left a real gap.

Added `test_fn_filter_surfaces_unselected_verifier_error_text`
mirroring the JSON test for human output: asserts the displayed
function list stays scoped to `identity` (per `--fn`), `broken`'s
verifier error still surfaces in the Diagnostics section, the
summary line names the unlisted verifier error, and the exit
code is 1.

**Finding 2: classifier internals exported for CLI-only use**
(vera/tester.py).  Last round I dropped the underscore from
`_failed_function_name` so cli.py could import it across the
module boundary.  CR pushed back: those exports (along with
`VERIFICATION_ERROR_CODES`) are CLI-only bookkeeping and don't
belong on `vera.tester`'s public surface.

Three options from CR:
- (a) Privatise the exports — cli.py keeps reaching across the
  underscore boundary
- (b) Move the logic into a CLI-only shared helper
- (c) Expose the computed value as structured data on
  TestResult/TestSummary so cli.py reads it as a field

Went with (c) — it's the cleanest because the only thing cli.py
needed the exports for was a single count (verifier errors
whose target function isn't in the displayed list).  Adding
`unlisted_errors: int = 0` to TestSummary, computing it in the
engine after the main loop, and reading it as
`s.unlisted_errors` in cli.py makes vera/cli.py purely
presentational — no logic, no regex, no classifier internals.
The engine returns the canonical answer and the formatter just
renders it.

Specifically:
- `vera/tester.py::TestSummary` gains `unlisted_errors: int = 0`
- `vera/tester.py::_TestEngine.run()` computes
  `summary.unlisted_errors` before returning by counting
  verifier-error diagnostics whose attributable function isn't
  in the `displayed_failed` set derived from `results`
- `VERIFICATION_ERROR_CODES` → `_VERIFICATION_ERROR_CODES`
  (private; only the engine uses it now)
- `failed_function_name` → `_failed_function_name` (private;
  only the engine uses it now)
- `vera/cli.py` drops the imports + the local `displayed_failed`
  set + the `sum(...)` comprehension; just reads
  `s.unlisted_errors` directly
- JSON output gains `"unlisted_errors": s.unlisted_errors` in
  the summary dict for downstream CI consumers
- `test_fn_filter_surfaces_unselected_verifier_error` (JSON)
  now also asserts `data["summary"]["unlisted_errors"] == 1`
  to pin the new structured field

This is option (c) — the principled long-term shape.  Option
(a) (which I did last round) was the minimum-touch fix but
left `vera.tester`'s public API leaking implementation details
the CLI happened to need.  CR is right that (c) is the better
contract.

Validation: pytest 3,887 / 3,887 (one new test for Finding 1),
mypy clean, ruff S clean, conformance 86/86, examples 34/34,
doc-counts consistent (3,916 → 3,917 total; test_cli.py
225 → 226 tests, 3,250 → 3,300 lines).

Co-Authored-By: Claude <noreply@anthropic.invalid>
@aallan

aallan commented May 19, 2026

Copy link
Copy Markdown
Owner

Cleanup commit. (a7b0c61) — round-2 CR findings: text-mode sibling for the --fn filter test, and restructuring the public API. CR (correctly) pushed back on the round-1 rename: those classifier internals were CLI-only bookkeeping that didn't belong on vera.tester's public surface. Replaced with TestSummary.unlisted_errors as structured data, the engine computes the value, cli.py reads it. _VERIFICATION_ERROR_CODES and _failed_function_name are private again. JSON output gains the new field so downstream CI consumers read structured data instead of re-running attribution on the diagnostics array.

@aallan aallan merged commit d073749 into aallan:main May 19, 2026
38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

vera test --json reports verified/ok for a postcondition failure caught by vera verify --json

2 participants