feat: vera/speculativeEdit proof-delta — completes the LSP server (#222 Phase E)#720
Conversation
… Phase E) The one custom LSP method, and the payoff of building the obligation core first: apply an edit in memory, re-verify on the warm incremental session, and return a proof_delta (newly_discharged / newly_undischarged / timed_out / removed / unchanged) computed as a set-difference over Phase A obligation identities. An agent proposing an edit learns whether it keeps, breaks, or strengthens the program's proofs before committing it — the latency-to-confidence loop from the #222 design notes, and the keeps/drops-Tier-1 signal deferred from Phase D lands here naturally (verified -> tier3 is a drop). vera/lsp/extensions.py: proof_delta (pure set-difference) + speculative_edit (verify a copy; parse/type errors report ok:false with the error count — 'this edit doesn't even compile' is itself the answer). Speculative runs share the warm session and discharge cache (pre-warming: a speculative state that becomes real replays from cache) under the same lock, but never touch per-URI analyses or published diagnostics — canonical editor state is untouched, pinned by a test that re-verifies the original source afterwards and observes full cache replay. server.py: vera/speculativeEdit registered; params accepted as plain JSON ({uri, text}). Closing sweep for the #222 plan (this is the final phase): - vera/README.md: 'No LSP server' row deleted from Current Limitations; obligations/ and lsp/ packages added to the module map. - ROADMAP.md: #222 rows deleted from the agent-integration tier and the Phase 3a milestone (completed items move to HISTORY, per convention). - HISTORY.md v0.0.165 row; CHANGELOG 0.0.165 cut summarising the full A-E arc (v0.0.161-v0.0.165); version bump across all tracked sites; uv.lock; site assets. Tests (tests/test_lsp.py 45 -> 52): identical-text all-unchanged; breaking edit surfaces newly_undischarged (violated nat_sub) with canonical state untouched; strengthening edit surfaces newly_discharged; parse/type errors report ok:false; deleted functions report removed; proof_delta purity. Validation: mypy strict clean (69 files), ruff S clean, LSP + obligations suites green, doc counts consistent (4,272), version sync 6/6, site assets current. Part of #222 (Phase E — the PR carries the closing footer). Co-Authored-By: Claude <noreply@anthropic.invalid>
📝 WalkthroughWalkthroughPhase E adds speculative edit verification to the Vera LSP server. A new ChangesSpeculative Edit Feature with Proof-Delta
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes The feature introduces two new export functions with non-trivial logic (proof-delta set-difference, speculative-edit error handling), one server integration point, and a solid test suite covering multiple scenarios. Mixed complexity across obligation identity tracking, exception handling, and test orchestration; straightforward LSP wiring. Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #720 +/- ##
==========================================
- Coverage 90.96% 90.95% -0.01%
==========================================
Files 68 69 +1
Lines 24630 24676 +46
Branches 292 292
==========================================
+ Hits 22404 22444 +40
- Misses 2219 2225 +6
Partials 7 7
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@vera/README.md`:
- Around line 115-117: The markdown rows include inline code spans with
leading/trailing spaces (e.g. "` core.py`", "` cache.py`", "` session.py`")
which trigger MD038; remove the inner padding so the spans are backticked
without spaces (e.g. change "` core.py`" to "`core.py`") and apply the same
change to the other affected rows (including the additional occurrences
mentioned around lines 119-123) to satisfy markdownlint.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f3da4af4-7f7b-47d9-a9b3-df4b125e9381
⛔ Files ignored due to path filters (6)
docs/index.htmlis excluded by!docs/**docs/index.mdis excluded by!docs/**docs/llms-full.txtis excluded by!docs/**docs/llms.txtis excluded by!docs/**docs/sitemap.xmlis excluded by!docs/**uv.lockis excluded by!**/*.lock,!uv.lock
📒 Files selected for processing (11)
CHANGELOG.mdHISTORY.mdREADME.mdROADMAP.mdTESTING.mdpyproject.tomltests/test_lsp.pyvera/README.mdvera/__init__.pyvera/lsp/extensions.pyvera/lsp/server.py
| | ` core.py` | 109 | | ProofObligation record: identity (content_key) + discharge outcome | | | ||
| | ` cache.py` | 219 | | Invalidation keys (structural/callee/context hashes), DischargeCache | | | ||
| | ` session.py` | 252 | | Warm-Z3 daemon: per-function replay vs re-verify in declaration order | | |
There was a problem hiding this comment.
Remove inner padding in inline code spans to satisfy markdownlint MD038.
These rows use code spans with leading/trailing spaces (for example ` core.py`), which triggers MD038 and can fail lint gates.
Suggested fix
-| ` core.py` | 109 | | ProofObligation record: identity (content_key) + discharge outcome | |
-| ` cache.py` | 219 | | Invalidation keys (structural/callee/context hashes), DischargeCache | |
-| ` session.py` | 252 | | Warm-Z3 daemon: per-function replay vs re-verify in declaration order | |
+| `core.py` | 109 | | ProofObligation record: identity (content_key) + discharge outcome | |
+| `cache.py` | 219 | | Invalidation keys (structural/callee/context hashes), DischargeCache | |
+| `session.py` | 252 | | Warm-Z3 daemon: per-function replay vs re-verify in declaration order | |
-| ` convert.py` | 144 | | Span/SourceLocation/LSP coordinate conversions, UTF-16 transcoding | |
-| ` documents.py` | 69 | | URI-keyed document store, full-text sync | |
-| ` features.py` | 292 | | Diagnostics + tier hints, hover, slot goto, hole completion | |
-| ` extensions.py` | 146 | | vera/speculativeEdit proof-delta | |
-| ` server.py` | 169 | | pygls wiring, single-session serialisation | |
+| `convert.py` | 144 | | Span/SourceLocation/LSP coordinate conversions, UTF-16 transcoding | |
+| `documents.py` | 69 | | URI-keyed document store, full-text sync | |
+| `features.py` | 292 | | Diagnostics + tier hints, hover, slot goto, hole completion | |
+| `extensions.py` | 146 | | vera/speculativeEdit proof-delta | |
+| `server.py` | 169 | | pygls wiring, single-session serialisation | |Also applies to: 119-123
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 115-115: Spaces inside code span elements
(MD038, no-space-in-code)
[warning] 116-116: Spaces inside code span elements
(MD038, no-space-in-code)
[warning] 117-117: Spaces inside code span elements
(MD038, no-space-in-code)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@vera/README.md` around lines 115 - 117, The markdown rows include inline code
spans with leading/trailing spaces (e.g. "` core.py`", "` cache.py`", "`
session.py`") which trigger MD038; remove the inner padding so the spans are
backticked without spaces (e.g. change "` core.py`" to "`core.py`") and apply
the same change to the other affected rows (including the additional occurrences
mentioned around lines 119-123) to satisfy markdownlint.
Source: Linters/SAST tools
There was a problem hiding this comment.
@coderabbitai Declining this one: the leading-space code-span style is the file's established sub-row convention — the pre-existing checker/ section (lines 84-90, untouched by this PR) uses exactly core.py etc., so the new obligations/ and lsp/ rows deliberately match it. There is no markdownlint gate in this repo's CI or pre-commit, and changing only the new rows would break the table's visual consistency; a repo-wide MD038 cleanup would be its own change.
There was a problem hiding this comment.
@aallan, understood — the indented code-span style is the file's established sub-row convention, matching the pre-existing checker/ entries, there is no MD038 gate in CI, and a repo-wide cleanup would be a separate concern. Withdrawing the comment.
✏️ Learnings added
Learnt from: aallan
Repo: aallan/vera PR: 720
File: vera/README.md:115-117
Timestamp: 2026-06-11T00:03:34.481Z
Learning: In `vera/README.md`, the Module Map table uses leading-space code spans (e.g. `` ` core.py` ``) as the established visual convention for sub-module rows (indented entries under a package row). This is intentional and consistent with the pre-existing `checker/` section. Do not flag these as MD038 violations; there is no markdownlint gate in the repo's CI or pre-commit configuration, and changing only new rows would break visual consistency with existing ones.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: aallan
Repo: aallan/vera PR: 571
File: HISTORY.md:301-301
Timestamp: 2026-05-05T17:09:21.991Z
Learning: In the aallan/vera repo documentation, compute the canonical test count using pytest collection, not by line-counting source code. Use `pytest tests/ --collect-only -q` (or `python scripts/check_doc_counts.py`) as the source of truth. Do not verify “number of tests” by counting `def test_` / `class Test` lines with ripgrep (`rg`) or by shell-based line counts, because parametrized/fixture-driven tests can produce multiple collected test items from a single `def` line and this will undercount.
Learnt from: aallan
Repo: aallan/vera PR: 577
File: HISTORY.md:301-311
Timestamp: 2026-05-06T08:33:56.675Z
Learning: Do not trust pytest collection counts from the CodeRabbit sandbox when reporting test counts in Markdown docs (e.g., README.md, HISTORY.md, TESTING.md, ROADMAP.md). The sandbox lacks the full Python dependencies and yields inaccurate counts. Verify counts in the real development environment by running 'pytest tests/ --collect-only -q' or by using 'scripts/check_doc_counts.py', and base the documented numbers on that output. Apply this guidance broadly to Markdown docs in the repo, following the canonical approach referenced by learning 3b066ed3-0d66-41bd-a058-ccdf04052082.
Learnt from: aallan
Repo: aallan/vera PR: 613
File: ROADMAP.md:7-7
Timestamp: 2026-05-07T20:05:33.932Z
Learning: When reviewing Markdown documentation in aallan/vera that involves numeric count/summary verification (including tests, conformance programs, examples, and spec chapters), treat `python scripts/check_doc_counts.py` as the single authoritative source of truth. Do not use cross-references against `README.md` or other Markdown files to validate counts, since they may be temporarily stale during PR series. If you see a count discrepancy detected via `rg`/`grep` against `README.md`, treat it as non-actionable; only fail/flag the issue when `check_doc_counts.py` reports an error.
Learnt from: aallan
Repo: aallan/vera PR: 662
File: HISTORY.md:292-293
Timestamp: 2026-05-12T08:14:28.532Z
Learning: In aallan/vera, ensure the “tagged release” count stays perfectly in sync across exactly two locations per release: (1) the roll-up footer total in HISTORY.md (the “N tagged releases” line) and (2) the “N releases” figure in README.md (around line ~184 in the project-status sentence). ROADMAP.md intentionally does not include this counter. During PR review, flag changes that update only one of these two designated locations without updating the other to the same tagged-release count.
Learnt from: aallan
Repo: aallan/vera PR: 689
File: TESTING.md:0-0
Timestamp: 2026-05-19T11:05:07.302Z
Learning: When reviewing Vera documentation Markdown (e.g., TESTING.md), don’t update only the instance shown in the diff hunk for stale numeric count changes. Instead, treat the old count as a value that must be replaced everywhere in the same file: search the entire file for all literal occurrences of the old number (including table cells, overview/feature-coverage sections, round-trip sections, pre-commit/CI tables, and any other descriptive text) and ensure every occurrence is updated consistently as part of the same change. Use a whole-file search (e.g., grep-style patterns such as `34 example|all 34|All 34` or an equivalent literal-number match) to confirm no old-count references remain before commenting/approving.
Learnt from: aallan
Repo: aallan/vera PR: 689
File: CHANGELOG.md:0-0
Timestamp: 2026-05-19T12:34:26.776Z
Learning: When documenting `IO.read_char` behavior in the aallan/vera docs for Unix TTY inputs, explicitly state that it uses termios *cbreak* mode via `tty.setcbreak()` (so `ISIG` remains enabled and Ctrl-C works), and that it should *not* use raw mode. Also document that in the Unix TTY branch, Ctrl-D maps to EOF, while the non-TTY path reads a single byte using `sys.stdin.read(1)`.
Learnt from: aallan
Repo: aallan/vera PR: 697
File: ROADMAP.md:0-0
Timestamp: 2026-05-26T16:23:53.102Z
Learning: In Markdown documentation, do not accept citation/attribution URLs as valid based solely on plausible formatting. For external attribution links (e.g., tweets/X posts, blog posts, arXiv papers, GitHub permalinks, etc.), verify that the linked content is publicly accessible and matches the claimed author/date/title. If the source cannot be confirmed as accessible/accurate, flag the link as “unverifiable” and recommend replacing it with a confirmed URL or removing the URL while keeping the attribution text (e.g., “Szymon Teżewski (jasisz1)’s 24 May 2026 tweet”). Rule of thumb: verify or omit—never fabricate or assume.
Summary
Phase E — the final phase of the #222 plan:
vera/speculativeEdit, the one custom LSP method and the reason the obligation core was built first. Apply an edit in memory, re-verify on the warm incremental session, and return aproof_deltatelling an agent whether the edit keeps, breaks, or strengthens the program's proofs — before committing it. The latency-to-confidence loop from the issue's 2026-04-20 design notes, delivered.With this merge, all five phases of the plan have shipped across v0.0.161–v0.0.165:
ProofObligations + warm Z3 session (#716)vera lspstdio transport + coordinate layer (#718)vera/speculativeEditproof-delta (this PR)What's in it
vera/lsp/extensions.py:proof_delta— a pure set-difference over Phase A obligation identities (content_key), bucketing intonewly_discharged/newly_undischarged/timed_out/removed/unchanged, each item carrying fn/kind/expr/position and before/after status. The keeps/drops-Tier-1 signal deferred from Phase D lands here naturally:verified → tier3is a drop.speculative_editverifies a copy; parse/type errors reportok: falsewith the error count ("this edit doesn't even compile" is itself the answer an agent needs). Theremovedbucket is additive to the design notes' four — obligations deleted by the edit would otherwise vanish silently.vera/speculativeEditregistered; params are plain JSON{uri, text}(full-text, matching the sync model).Closing sweep (final-phase duties)
vera/README.md: the "No LSP server" row leaves Current Limitations; the module map gainsobligations/(637 lines) andlsp/(829 lines) with per-file rows.ROADMAP.md: LSP server #222 rows deleted from the agent-integration tier and the Phase 3a milestone (completed items move to HISTORY, per convention).uv.lock; site assets.Verification
tests/test_lsp.py45 → 52: identical-text → all-unchanged; breaking edit surfacesnewly_undischarged(the violatednat_sub) with canonical state untouched; strengthening edit surfacesnewly_discharged; parse/type errors →ok: false; deleted function →removed;proof_deltapurity.mypy vera/strict clean (69 files); ruff S clean; LSP + obligations suites green; full non-stress suite green via pre-commit; doc counts consistent (4,272 tests); version sync 6/6.Closes #222
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests