feat: vera/speculativeEdit proof-delta — completes the LSP server (#222 Phase E) by aallan · Pull Request #720 · aallan/vera

aallan · 2026-06-10T23:29:58Z

Summary

Phase E — the final phase of the #222 plan: vera/speculativeEdit, the one custom LSP method and the reason the obligation core was built first. Apply an edit in memory, re-verify on the warm incremental session, and return a proof_delta telling an agent whether the edit keeps, breaks, or strengthens the program's proofs — before committing it. The latency-to-confidence loop from the issue's 2026-04-20 design notes, delivered.

With this merge, all five phases of the plan have shipped across v0.0.161–v0.0.165:

Phase	Release	What
A	v0.0.161	reified `ProofObligation`s + warm Z3 session (#716)
B	v0.0.162	incremental invalidation + discharge cache (#717)
C	v0.0.163	`vera lsp` stdio transport + coordinate layer (#718)
D	v0.0.164	diagnostics + tier hints, hover, slot goto, hole completion (#719)
E	v0.0.165	`vera/speculativeEdit` proof-delta (this PR)

What's in it

vera/lsp/extensions.py: proof_delta — a pure set-difference over Phase A obligation identities (content_key), bucketing into newly_discharged / newly_undischarged / timed_out / removed / unchanged, each item carrying fn/kind/expr/position and before/after status. The keeps/drops-Tier-1 signal deferred from Phase D lands here naturally: verified → tier3 is a drop. speculative_edit verifies a copy; parse/type errors report ok: false with the error count ("this edit doesn't even compile" is itself the answer an agent needs). The removed bucket is additive to the design notes' four — obligations deleted by the edit would otherwise vanish silently.
Isolation: speculative runs share the warm session and discharge cache (deliberate pre-warming — a speculative state that later becomes real replays from cache) under the same lock, but never touch the per-URI analyses or published diagnostics. Pinned by a test that re-verifies the original source after a speculative edit and observes full cache replay.
Server wiring: vera/speculativeEdit registered; params are plain JSON {uri, text} (full-text, matching the sync model).

Closing sweep (final-phase duties)

vera/README.md: the "No LSP server" row leaves Current Limitations; the module map gains obligations/ (637 lines) and lsp/ (829 lines) with per-file rows.
ROADMAP.md: LSP server #222 rows deleted from the agent-integration tier and the Phase 3a milestone (completed items move to HISTORY, per convention).
CHANGELOG 0.0.165 summarises the full A–E arc; HISTORY.md one-liner; version bump across all tracked sites; uv.lock; site assets.

Verification

tests/test_lsp.py 45 → 52: identical-text → all-unchanged; breaking edit surfaces newly_undischarged (the violated nat_sub) with canonical state untouched; strengthening edit surfaces newly_discharged; parse/type errors → ok: false; deleted function → removed; proof_delta purity.
mypy vera/ strict clean (69 files); ruff S clean; LSP + obligations suites green; full non-stress suite green via pre-commit; doc counts consistent (4,272 tests); version sync 6/6.

Closes #222

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added speculative edit capability to the LSP server, allowing analysis of proposed source changes.
- Integrated proof-delta tracking to report obligation status changes during speculative edits.
Tests
- Expanded test suite with comprehensive coverage for speculative edit scenarios and proof-delta operations.

… Phase E) The one custom LSP method, and the payoff of building the obligation core first: apply an edit in memory, re-verify on the warm incremental session, and return a proof_delta (newly_discharged / newly_undischarged / timed_out / removed / unchanged) computed as a set-difference over Phase A obligation identities. An agent proposing an edit learns whether it keeps, breaks, or strengthens the program's proofs before committing it — the latency-to-confidence loop from the #222 design notes, and the keeps/drops-Tier-1 signal deferred from Phase D lands here naturally (verified -> tier3 is a drop). vera/lsp/extensions.py: proof_delta (pure set-difference) + speculative_edit (verify a copy; parse/type errors report ok:false with the error count — 'this edit doesn't even compile' is itself the answer). Speculative runs share the warm session and discharge cache (pre-warming: a speculative state that becomes real replays from cache) under the same lock, but never touch per-URI analyses or published diagnostics — canonical editor state is untouched, pinned by a test that re-verifies the original source afterwards and observes full cache replay. server.py: vera/speculativeEdit registered; params accepted as plain JSON ({uri, text}). Closing sweep for the #222 plan (this is the final phase): - vera/README.md: 'No LSP server' row deleted from Current Limitations; obligations/ and lsp/ packages added to the module map. - ROADMAP.md: #222 rows deleted from the agent-integration tier and the Phase 3a milestone (completed items move to HISTORY, per convention). - HISTORY.md v0.0.165 row; CHANGELOG 0.0.165 cut summarising the full A-E arc (v0.0.161-v0.0.165); version bump across all tracked sites; uv.lock; site assets. Tests (tests/test_lsp.py 45 -> 52): identical-text all-unchanged; breaking edit surfaces newly_undischarged (violated nat_sub) with canonical state untouched; strengthening edit surfaces newly_discharged; parse/type errors report ok:false; deleted functions report removed; proof_delta purity. Validation: mypy strict clean (69 files), ruff S clean, LSP + obligations suites green, doc counts consistent (4,272), version sync 6/6, site assets current. Part of #222 (Phase E — the PR carries the closing footer). Co-Authored-By: Claude <noreply@anthropic.invalid>

coderabbitai · 2026-06-10T23:30:08Z

📝 Walkthrough

Walkthrough

Phase E adds speculative edit verification to the Vera LSP server. A new vera/lsp/extensions.py module implements proof-delta computation—comparing baseline and speculative obligation sets by content-key identity to categorise newly discharged, newly undischarged, unchanged, timed-out, and removed obligations. The LSP handler wires this into create_server(), and comprehensive tests validate all branches: identical edits, precondition weakening/strengthening, parse/type errors, function deletion, and proof-delta purity semantics.

Changes

Speculative Edit Feature with Proof-Delta

Layer / File(s)	Summary
Proof-delta and speculative-edit logic `vera/lsp/extensions.py`	Obligation set-difference via `proof_delta()` using content-key identity; `_item()` serialisation helper; `speculative_edit()` invoking `session.verify_source()`, handling parse/transform/type-error exceptions, and returning structured results with proof-delta and diagnostic counts.
LSP server handler integration `vera/lsp/server.py`	Imports `speculative_edit` and registers the `vera/speculativeEdit` method handler within `create_server()`, extracting request parameters, loading baseline from cached analysis, invoking the extension under `analysis_lock`, and returning results.
Comprehensive Phase E test suite `tests/test_lsp.py`	`TestSpeculativeEdit` with `_baseline()` helper; tests for unchanged obligations, newly undischarged (weakened preconditions), newly discharged (strengthened preconditions), parse/type errors, function deletion, and proof-delta set-difference semantics over identity keys.
Version and documentation updates `pyproject.toml`, `vera/__init__.py`, `CHANGELOG.md`, `HISTORY.md`, `README.md`, `ROADMAP.md`, `TESTING.md`, `vera/README.md`	Bump to v0.0.165; document Phase E and proof-delta in changelog; update release history (Stage 12, 165 tagged releases, 62 active development days); increment test counts to 4,272 total and 4,241 passed; remove completed "LSP server" roadmap entries ([`#222`]); expand Module Map to document obligations and lsp subsystems; update TESTING.md coverage for test_lsp.py scope.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

The feature introduces two new export functions with non-trivial logic (proof-delta set-difference, speculative-edit error handling), one server integration point, and a solid test suite covering multiple scenarios. Mixed complexity across obligation identity tracking, exception handling, and test orchestration; straightforward LSP wiring.

Possibly related PRs

aallan/vera#718: Builds on Phase C LSP server scaffolding and test structure; extends create_server() feature-handler wiring and enlarges tests/test_lsp.py with Phase E speculative-edit coverage.

Suggested labels

compiler, tests, docs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 46.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically describes the main change: implementing the vera/speculativeEdit proof-delta feature that completes Phase E of the LSP server.
Linked Issues check	✅ Passed	Phase E (speculative edit with proof-delta) is the final piece completing LSP server (`#222`). All five feature objectives—Diagnostics, Completion, Go-to-definition, Hover, Find references—are either pre-existing or scaffolded here.
Out of Scope Changes check	✅ Passed	All changes directly support Phase E completion: speculative_edit and proof_delta implementations, LSP server wiring, test coverage, version bumps, and documentation (CHANGELOG, HISTORY, README, ROADMAP, TESTING) reflecting v0.0.165 release.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/222-phase-e-speculative-edit

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2026-06-10T23:31:36Z

Codecov Report

❌ Patch coverage is 85.41667% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.95%. Comparing base (5324e8c) to head (6520ac6).

Files with missing lines	Patch %	Lines
vera/lsp/server.py	33.33%	6 Missing ⚠️
vera/lsp/extensions.py	97.29%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #720      +/-   ##
==========================================
- Coverage   90.96%   90.95%   -0.01%     
==========================================
  Files          68       69       +1     
  Lines       24630    24676      +46     
  Branches      292      292              
==========================================
+ Hits        22404    22444      +40     
- Misses       2219     2225       +6     
  Partials        7        7

Flag	Coverage Δ
javascript	`61.40% <ø> (ø)`
python	`94.62% <85.41%> (-0.02%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@vera/README.md`:
- Around line 115-117: The markdown rows include inline code spans with
leading/trailing spaces (e.g. "`  core.py`", "`  cache.py`", "`  session.py`")
which trigger MD038; remove the inner padding so the spans are backticked
without spaces (e.g. change "`  core.py`" to "`core.py`") and apply the same
change to the other affected rows (including the additional occurrences
mentioned around lines 119-123) to satisfy markdownlint.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f3da4af4-7f7b-47d9-a9b3-df4b125e9381

📥 Commits

Reviewing files that changed from the base of the PR and between 5324e8c and 6520ac6.

⛔ Files ignored due to path filters (6)

docs/index.html is excluded by !docs/**
docs/index.md is excluded by !docs/**
docs/llms-full.txt is excluded by !docs/**
docs/llms.txt is excluded by !docs/**
docs/sitemap.xml is excluded by !docs/**
uv.lock is excluded by !**/*.lock, !uv.lock

📒 Files selected for processing (11)

CHANGELOG.md
HISTORY.md
README.md
ROADMAP.md
TESTING.md
pyproject.toml
tests/test_lsp.py
vera/README.md
vera/__init__.py
vera/lsp/extensions.py
vera/lsp/server.py

coderabbitai · 2026-06-10T23:38:07Z

+| `  core.py` | 109 | | ProofObligation record: identity (content_key) + discharge outcome | |
+| `  cache.py` | 219 | | Invalidation keys (structural/callee/context hashes), DischargeCache | |
+| `  session.py` | 252 | | Warm-Z3 daemon: per-function replay vs re-verify in declaration order | |


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove inner padding in inline code spans to satisfy markdownlint MD038.

These rows use code spans with leading/trailing spaces (for example ` core.py`), which triggers MD038 and can fail lint gates.

Suggested fix

-| ` core.py` | 109 | | ProofObligation record: identity (content_key) + discharge outcome | | -| ` cache.py` | 219 | | Invalidation keys (structural/callee/context hashes), DischargeCache | | -| ` session.py` | 252 | | Warm-Z3 daemon: per-function replay vs re-verify in declaration order | | +| `core.py` | 109 | | ProofObligation record: identity (content_key) + discharge outcome | | +| `cache.py` | 219 | | Invalidation keys (structural/callee/context hashes), DischargeCache | | +| `session.py` | 252 | | Warm-Z3 daemon: per-function replay vs re-verify in declaration order | | -| ` convert.py` | 144 | | Span/SourceLocation/LSP coordinate conversions, UTF-16 transcoding | | -| ` documents.py` | 69 | | URI-keyed document store, full-text sync | | -| ` features.py` | 292 | | Diagnostics + tier hints, hover, slot goto, hole completion | | -| ` extensions.py` | 146 | | vera/speculativeEdit proof-delta | | -| ` server.py` | 169 | | pygls wiring, single-session serialisation | | +| `convert.py` | 144 | | Span/SourceLocation/LSP coordinate conversions, UTF-16 transcoding | | +| `documents.py` | 69 | | URI-keyed document store, full-text sync | | +| `features.py` | 292 | | Diagnostics + tier hints, hover, slot goto, hole completion | | +| `extensions.py` | 146 | | vera/speculativeEdit proof-delta | | +| `server.py` | 169 | | pygls wiring, single-session serialisation | |

Also applies to: 119-123

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 115-115: Spaces inside code span elements

(MD038, no-space-in-code)

[warning] 116-116: Spaces inside code span elements

(MD038, no-space-in-code)

[warning] 117-117: Spaces inside code span elements

(MD038, no-space-in-code)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@vera/README.md` around lines 115 - 117, The markdown rows include inline code spans with leading/trailing spaces (e.g. "` core.py`", "` cache.py`", "` session.py`") which trigger MD038; remove the inner padding so the spans are backticked without spaces (e.g. change "` core.py`" to "`core.py`") and apply the same change to the other affected rows (including the additional occurrences mentioned around lines 119-123) to satisfy markdownlint.

Source: Linters/SAST tools

@coderabbitai Declining this one: the leading-space code-span style is the file's established sub-row convention — the pre-existing checker/ section (lines 84-90, untouched by this PR) uses exactly core.py etc., so the new obligations/ and lsp/ rows deliberately match it. There is no markdownlint gate in this repo's CI or pre-commit, and changing only the new rows would break the table's visual consistency; a repo-wide MD038 cleanup would be its own change.

@aallan, understood — the indented code-span style is the file's established sub-row convention, matching the pre-existing checker/ entries, there is no MD038 gate in CI, and a repo-wide cleanup would be a separate concern. Withdrawing the comment.

✏️ Learnings added

Learnt from: aallan Repo: aallan/vera PR: 720 File: vera/README.md:115-117 Timestamp: 2026-06-11T00:03:34.481Z Learning: In `vera/README.md`, the Module Map table uses leading-space code spans (e.g. `` ` core.py` ``) as the established visual convention for sub-module rows (indented entries under a package row). This is intentional and consistent with the pre-existing `checker/` section. Do not flag these as MD038 violations; there is no markdownlint gate in the repo's CI or pre-commit configuration, and changing only new rows would break visual consistency with existing ones.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

🧠 Learnings used

Learnt from: aallan Repo: aallan/vera PR: 571 File: HISTORY.md:301-301 Timestamp: 2026-05-05T17:09:21.991Z Learning: In the aallan/vera repo documentation, compute the canonical test count using pytest collection, not by line-counting source code. Use `pytest tests/ --collect-only -q` (or `python scripts/check_doc_counts.py`) as the source of truth. Do not verify “number of tests” by counting `def test_` / `class Test` lines with ripgrep (`rg`) or by shell-based line counts, because parametrized/fixture-driven tests can produce multiple collected test items from a single `def` line and this will undercount.

Learnt from: aallan Repo: aallan/vera PR: 577 File: HISTORY.md:301-311 Timestamp: 2026-05-06T08:33:56.675Z Learning: Do not trust pytest collection counts from the CodeRabbit sandbox when reporting test counts in Markdown docs (e.g., README.md, HISTORY.md, TESTING.md, ROADMAP.md). The sandbox lacks the full Python dependencies and yields inaccurate counts. Verify counts in the real development environment by running 'pytest tests/ --collect-only -q' or by using 'scripts/check_doc_counts.py', and base the documented numbers on that output. Apply this guidance broadly to Markdown docs in the repo, following the canonical approach referenced by learning 3b066ed3-0d66-41bd-a058-ccdf04052082.

Learnt from: aallan Repo: aallan/vera PR: 613 File: ROADMAP.md:7-7 Timestamp: 2026-05-07T20:05:33.932Z Learning: When reviewing Markdown documentation in aallan/vera that involves numeric count/summary verification (including tests, conformance programs, examples, and spec chapters), treat `python scripts/check_doc_counts.py` as the single authoritative source of truth. Do not use cross-references against `README.md` or other Markdown files to validate counts, since they may be temporarily stale during PR series. If you see a count discrepancy detected via `rg`/`grep` against `README.md`, treat it as non-actionable; only fail/flag the issue when `check_doc_counts.py` reports an error.

Learnt from: aallan Repo: aallan/vera PR: 662 File: HISTORY.md:292-293 Timestamp: 2026-05-12T08:14:28.532Z Learning: In aallan/vera, ensure the “tagged release” count stays perfectly in sync across exactly two locations per release: (1) the roll-up footer total in HISTORY.md (the “N tagged releases” line) and (2) the “N releases” figure in README.md (around line ~184 in the project-status sentence). ROADMAP.md intentionally does not include this counter. During PR review, flag changes that update only one of these two designated locations without updating the other to the same tagged-release count.

Learnt from: aallan Repo: aallan/vera PR: 689 File: TESTING.md:0-0 Timestamp: 2026-05-19T11:05:07.302Z Learning: When reviewing Vera documentation Markdown (e.g., TESTING.md), don’t update only the instance shown in the diff hunk for stale numeric count changes. Instead, treat the old count as a value that must be replaced everywhere in the same file: search the entire file for all literal occurrences of the old number (including table cells, overview/feature-coverage sections, round-trip sections, pre-commit/CI tables, and any other descriptive text) and ensure every occurrence is updated consistently as part of the same change. Use a whole-file search (e.g., grep-style patterns such as `34 example|all 34|All 34` or an equivalent literal-number match) to confirm no old-count references remain before commenting/approving.

Learnt from: aallan Repo: aallan/vera PR: 689 File: CHANGELOG.md:0-0 Timestamp: 2026-05-19T12:34:26.776Z Learning: When documenting `IO.read_char` behavior in the aallan/vera docs for Unix TTY inputs, explicitly state that it uses termios *cbreak* mode via `tty.setcbreak()` (so `ISIG` remains enabled and Ctrl-C works), and that it should *not* use raw mode. Also document that in the Unix TTY branch, Ctrl-D maps to EOF, while the non-TTY path reads a single byte using `sys.stdin.read(1)`.

Learnt from: aallan Repo: aallan/vera PR: 697 File: ROADMAP.md:0-0 Timestamp: 2026-05-26T16:23:53.102Z Learning: In Markdown documentation, do not accept citation/attribution URLs as valid based solely on plausible formatting. For external attribution links (e.g., tweets/X posts, blog posts, arXiv papers, GitHub permalinks, etc.), verify that the linked content is publicly accessible and matches the claimed author/date/title. If the source cannot be confirmed as accessible/accurate, flag the link as “unverifiable” and recommend replacing it with a confirmed URL or removing the URL while keeping the attribution text (e.g., “Szymon Teżewski (jasisz1)’s 24 May 2026 tweet”). Rule of thumb: verify or omit—never fabricate or assume.

coderabbitai Bot reviewed Jun 10, 2026

View reviewed changes

aallan merged commit d2e11ae into main Jun 11, 2026
27 checks passed

aallan deleted the feat/222-phase-e-speculative-edit branch June 11, 2026 00:19

aallan mentioned this pull request Jun 11, 2026

LSP server #222

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: vera/speculativeEdit proof-delta — completes the LSP server (#222 Phase E)#720

feat: vera/speculativeEdit proof-delta — completes the LSP server (#222 Phase E)#720
aallan merged 1 commit into
mainfrom
feat/222-phase-e-speculative-edit

aallan commented Jun 10, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 10, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

❌ Failed checks (1 warning)

Uh oh!

codecov Bot commented Jun 10, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 10, 2026 •

edited

Loading

Uh oh!

aallan Jun 11, 2026

Uh oh!

coderabbitai Bot Jun 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

aallan commented Jun 10, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What's in it

Closing sweep (final-phase duties)

Verification

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

❌ Failed checks (1 warning)

Uh oh!

codecov Bot commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

aallan Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

aallan commented Jun 10, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 10, 2026 •

edited

Loading

codecov Bot commented Jun 10, 2026 •

edited

Loading

coderabbitai Bot Jun 10, 2026 •

edited

Loading