Skip to content

feat: vera/speculativeEdit proof-delta — completes the LSP server (#222 Phase E)#720

Merged
aallan merged 1 commit into
mainfrom
feat/222-phase-e-speculative-edit
Jun 11, 2026
Merged

feat: vera/speculativeEdit proof-delta — completes the LSP server (#222 Phase E)#720
aallan merged 1 commit into
mainfrom
feat/222-phase-e-speculative-edit

Conversation

@aallan

@aallan aallan commented Jun 10, 2026

Copy link
Copy Markdown
Owner

Summary

Phase E — the final phase of the #222 plan: vera/speculativeEdit, the one custom LSP method and the reason the obligation core was built first. Apply an edit in memory, re-verify on the warm incremental session, and return a proof_delta telling an agent whether the edit keeps, breaks, or strengthens the program's proofs — before committing it. The latency-to-confidence loop from the issue's 2026-04-20 design notes, delivered.

With this merge, all five phases of the plan have shipped across v0.0.161–v0.0.165:

Phase Release What
A v0.0.161 reified ProofObligations + warm Z3 session (#716)
B v0.0.162 incremental invalidation + discharge cache (#717)
C v0.0.163 vera lsp stdio transport + coordinate layer (#718)
D v0.0.164 diagnostics + tier hints, hover, slot goto, hole completion (#719)
E v0.0.165 vera/speculativeEdit proof-delta (this PR)

What's in it

  • vera/lsp/extensions.py: proof_delta — a pure set-difference over Phase A obligation identities (content_key), bucketing into newly_discharged / newly_undischarged / timed_out / removed / unchanged, each item carrying fn/kind/expr/position and before/after status. The keeps/drops-Tier-1 signal deferred from Phase D lands here naturally: verified → tier3 is a drop. speculative_edit verifies a copy; parse/type errors report ok: false with the error count ("this edit doesn't even compile" is itself the answer an agent needs). The removed bucket is additive to the design notes' four — obligations deleted by the edit would otherwise vanish silently.
  • Isolation: speculative runs share the warm session and discharge cache (deliberate pre-warming — a speculative state that later becomes real replays from cache) under the same lock, but never touch the per-URI analyses or published diagnostics. Pinned by a test that re-verifies the original source after a speculative edit and observes full cache replay.
  • Server wiring: vera/speculativeEdit registered; params are plain JSON {uri, text} (full-text, matching the sync model).

Closing sweep (final-phase duties)

  • vera/README.md: the "No LSP server" row leaves Current Limitations; the module map gains obligations/ (637 lines) and lsp/ (829 lines) with per-file rows.
  • ROADMAP.md: LSP server #222 rows deleted from the agent-integration tier and the Phase 3a milestone (completed items move to HISTORY, per convention).
  • CHANGELOG 0.0.165 summarises the full A–E arc; HISTORY.md one-liner; version bump across all tracked sites; uv.lock; site assets.

Verification

  • tests/test_lsp.py 45 → 52: identical-text → all-unchanged; breaking edit surfaces newly_undischarged (the violated nat_sub) with canonical state untouched; strengthening edit surfaces newly_discharged; parse/type errors → ok: false; deleted function → removed; proof_delta purity.
  • mypy vera/ strict clean (69 files); ruff S clean; LSP + obligations suites green; full non-stress suite green via pre-commit; doc counts consistent (4,272 tests); version sync 6/6.

Closes #222

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added speculative edit capability to the LSP server, allowing analysis of proposed source changes.
    • Integrated proof-delta tracking to report obligation status changes during speculative edits.
  • Tests

    • Expanded test suite with comprehensive coverage for speculative edit scenarios and proof-delta operations.

… Phase E)

The one custom LSP method, and the payoff of building the obligation
core first: apply an edit in memory, re-verify on the warm incremental
session, and return a proof_delta (newly_discharged /
newly_undischarged / timed_out / removed / unchanged) computed as a
set-difference over Phase A obligation identities. An agent proposing
an edit learns whether it keeps, breaks, or strengthens the program's
proofs before committing it — the latency-to-confidence loop from the
#222 design notes, and the keeps/drops-Tier-1 signal deferred from
Phase D lands here naturally (verified -> tier3 is a drop).

vera/lsp/extensions.py: proof_delta (pure set-difference) +
speculative_edit (verify a copy; parse/type errors report ok:false
with the error count — 'this edit doesn't even compile' is itself the
answer). Speculative runs share the warm session and discharge cache
(pre-warming: a speculative state that becomes real replays from
cache) under the same lock, but never touch per-URI analyses or
published diagnostics — canonical editor state is untouched, pinned by
a test that re-verifies the original source afterwards and observes
full cache replay.

server.py: vera/speculativeEdit registered; params accepted as plain
JSON ({uri, text}).

Closing sweep for the #222 plan (this is the final phase):
- vera/README.md: 'No LSP server' row deleted from Current
  Limitations; obligations/ and lsp/ packages added to the module map.
- ROADMAP.md: #222 rows deleted from the agent-integration tier and
  the Phase 3a milestone (completed items move to HISTORY, per
  convention).
- HISTORY.md v0.0.165 row; CHANGELOG 0.0.165 cut summarising the full
  A-E arc (v0.0.161-v0.0.165); version bump across all tracked sites;
  uv.lock; site assets.

Tests (tests/test_lsp.py 45 -> 52): identical-text all-unchanged;
breaking edit surfaces newly_undischarged (violated nat_sub) with
canonical state untouched; strengthening edit surfaces
newly_discharged; parse/type errors report ok:false; deleted functions
report removed; proof_delta purity.

Validation: mypy strict clean (69 files), ruff S clean, LSP +
obligations suites green, doc counts consistent (4,272), version sync
6/6, site assets current.

Part of #222 (Phase E — the PR carries the closing footer).

Co-Authored-By: Claude <noreply@anthropic.invalid>
@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Phase E adds speculative edit verification to the Vera LSP server. A new vera/lsp/extensions.py module implements proof-delta computation—comparing baseline and speculative obligation sets by content-key identity to categorise newly discharged, newly undischarged, unchanged, timed-out, and removed obligations. The LSP handler wires this into create_server(), and comprehensive tests validate all branches: identical edits, precondition weakening/strengthening, parse/type errors, function deletion, and proof-delta purity semantics.

Changes

Speculative Edit Feature with Proof-Delta

Layer / File(s) Summary
Proof-delta and speculative-edit logic
vera/lsp/extensions.py
Obligation set-difference via proof_delta() using content-key identity; _item() serialisation helper; speculative_edit() invoking session.verify_source(), handling parse/transform/type-error exceptions, and returning structured results with proof-delta and diagnostic counts.
LSP server handler integration
vera/lsp/server.py
Imports speculative_edit and registers the vera/speculativeEdit method handler within create_server(), extracting request parameters, loading baseline from cached analysis, invoking the extension under analysis_lock, and returning results.
Comprehensive Phase E test suite
tests/test_lsp.py
TestSpeculativeEdit with _baseline() helper; tests for unchanged obligations, newly undischarged (weakened preconditions), newly discharged (strengthened preconditions), parse/type errors, function deletion, and proof-delta set-difference semantics over identity keys.
Version and documentation updates
pyproject.toml, vera/__init__.py, CHANGELOG.md, HISTORY.md, README.md, ROADMAP.md, TESTING.md, vera/README.md
Bump to v0.0.165; document Phase E and proof-delta in changelog; update release history (Stage 12, 165 tagged releases, 62 active development days); increment test counts to 4,272 total and 4,241 passed; remove completed "LSP server" roadmap entries ([#222]); expand Module Map to document obligations and lsp subsystems; update TESTING.md coverage for test_lsp.py scope.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

The feature introduces two new export functions with non-trivial logic (proof-delta set-difference, speculative-edit error handling), one server integration point, and a solid test suite covering multiple scenarios. Mixed complexity across obligation identity tracking, exception handling, and test orchestration; straightforward LSP wiring.

Possibly related PRs

  • aallan/vera#718: Builds on Phase C LSP server scaffolding and test structure; extends create_server() feature-handler wiring and enlarges tests/test_lsp.py with Phase E speculative-edit coverage.

Suggested labels

compiler, tests, docs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 46.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: implementing the vera/speculativeEdit proof-delta feature that completes Phase E of the LSP server.
Linked Issues check ✅ Passed Phase E (speculative edit with proof-delta) is the final piece completing LSP server (#222). All five feature objectives—Diagnostics, Completion, Go-to-definition, Hover, Find references—are either pre-existing or scaffolded here.
Out of Scope Changes check ✅ Passed All changes directly support Phase E completion: speculative_edit and proof_delta implementations, LSP server wiring, test coverage, version bumps, and documentation (CHANGELOG, HISTORY, README, ROADMAP, TESTING) reflecting v0.0.165 release.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/222-phase-e-speculative-edit

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov

codecov Bot commented Jun 10, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 85.41667% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.95%. Comparing base (5324e8c) to head (6520ac6).

Files with missing lines Patch % Lines
vera/lsp/server.py 33.33% 6 Missing ⚠️
vera/lsp/extensions.py 97.29% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #720      +/-   ##
==========================================
- Coverage   90.96%   90.95%   -0.01%     
==========================================
  Files          68       69       +1     
  Lines       24630    24676      +46     
  Branches      292      292              
==========================================
+ Hits        22404    22444      +40     
- Misses       2219     2225       +6     
  Partials        7        7              
Flag Coverage Δ
javascript 61.40% <ø> (ø)
python 94.62% <85.41%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@vera/README.md`:
- Around line 115-117: The markdown rows include inline code spans with
leading/trailing spaces (e.g. "`  core.py`", "`  cache.py`", "`  session.py`")
which trigger MD038; remove the inner padding so the spans are backticked
without spaces (e.g. change "`  core.py`" to "`core.py`") and apply the same
change to the other affected rows (including the additional occurrences
mentioned around lines 119-123) to satisfy markdownlint.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f3da4af4-7f7b-47d9-a9b3-df4b125e9381

📥 Commits

Reviewing files that changed from the base of the PR and between 5324e8c and 6520ac6.

⛔ Files ignored due to path filters (6)
  • docs/index.html is excluded by !docs/**
  • docs/index.md is excluded by !docs/**
  • docs/llms-full.txt is excluded by !docs/**
  • docs/llms.txt is excluded by !docs/**
  • docs/sitemap.xml is excluded by !docs/**
  • uv.lock is excluded by !**/*.lock, !uv.lock
📒 Files selected for processing (11)
  • CHANGELOG.md
  • HISTORY.md
  • README.md
  • ROADMAP.md
  • TESTING.md
  • pyproject.toml
  • tests/test_lsp.py
  • vera/README.md
  • vera/__init__.py
  • vera/lsp/extensions.py
  • vera/lsp/server.py

Comment thread vera/README.md
Comment on lines +115 to +117
| ` core.py` | 109 | | ProofObligation record: identity (content_key) + discharge outcome | |
| ` cache.py` | 219 | | Invalidation keys (structural/callee/context hashes), DischargeCache | |
| ` session.py` | 252 | | Warm-Z3 daemon: per-function replay vs re-verify in declaration order | |

@coderabbitai coderabbitai Bot Jun 10, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove inner padding in inline code spans to satisfy markdownlint MD038.

These rows use code spans with leading/trailing spaces (for example ` core.py`), which triggers MD038 and can fail lint gates.

Suggested fix
-| `  core.py` | 109 | | ProofObligation record: identity (content_key) + discharge outcome | |
-| `  cache.py` | 219 | | Invalidation keys (structural/callee/context hashes), DischargeCache | |
-| `  session.py` | 252 | | Warm-Z3 daemon: per-function replay vs re-verify in declaration order | |
+| `core.py` | 109 | | ProofObligation record: identity (content_key) + discharge outcome | |
+| `cache.py` | 219 | | Invalidation keys (structural/callee/context hashes), DischargeCache | |
+| `session.py` | 252 | | Warm-Z3 daemon: per-function replay vs re-verify in declaration order | |
-| `  convert.py` | 144 | | Span/SourceLocation/LSP coordinate conversions, UTF-16 transcoding | |
-| `  documents.py` | 69 | | URI-keyed document store, full-text sync | |
-| `  features.py` | 292 | | Diagnostics + tier hints, hover, slot goto, hole completion | |
-| `  extensions.py` | 146 | | vera/speculativeEdit proof-delta | |
-| `  server.py` | 169 | | pygls wiring, single-session serialisation | |
+| `convert.py` | 144 | | Span/SourceLocation/LSP coordinate conversions, UTF-16 transcoding | |
+| `documents.py` | 69 | | URI-keyed document store, full-text sync | |
+| `features.py` | 292 | | Diagnostics + tier hints, hover, slot goto, hole completion | |
+| `extensions.py` | 146 | | vera/speculativeEdit proof-delta | |
+| `server.py` | 169 | | pygls wiring, single-session serialisation | |

Also applies to: 119-123

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 115-115: Spaces inside code span elements

(MD038, no-space-in-code)


[warning] 116-116: Spaces inside code span elements

(MD038, no-space-in-code)


[warning] 117-117: Spaces inside code span elements

(MD038, no-space-in-code)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@vera/README.md` around lines 115 - 117, The markdown rows include inline code
spans with leading/trailing spaces (e.g. "`  core.py`", "`  cache.py`", "` 
session.py`") which trigger MD038; remove the inner padding so the spans are
backticked without spaces (e.g. change "`  core.py`" to "`core.py`") and apply
the same change to the other affected rows (including the additional occurrences
mentioned around lines 119-123) to satisfy markdownlint.

Source: Linters/SAST tools

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai Declining this one: the leading-space code-span style is the file's established sub-row convention — the pre-existing checker/ section (lines 84-90, untouched by this PR) uses exactly core.py etc., so the new obligations/ and lsp/ rows deliberately match it. There is no markdownlint gate in this repo's CI or pre-commit, and changing only the new rows would break the table's visual consistency; a repo-wide MD038 cleanup would be its own change.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aallan, understood — the indented code-span style is the file's established sub-row convention, matching the pre-existing checker/ entries, there is no MD038 gate in CI, and a repo-wide cleanup would be a separate concern. Withdrawing the comment.


✏️ Learnings added
Learnt from: aallan
Repo: aallan/vera PR: 720
File: vera/README.md:115-117
Timestamp: 2026-06-11T00:03:34.481Z
Learning: In `vera/README.md`, the Module Map table uses leading-space code spans (e.g. `` `  core.py` ``) as the established visual convention for sub-module rows (indented entries under a package row). This is intentional and consistent with the pre-existing `checker/` section. Do not flag these as MD038 violations; there is no markdownlint gate in the repo's CI or pre-commit configuration, and changing only new rows would break visual consistency with existing ones.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: aallan
Repo: aallan/vera PR: 571
File: HISTORY.md:301-301
Timestamp: 2026-05-05T17:09:21.991Z
Learning: In the aallan/vera repo documentation, compute the canonical test count using pytest collection, not by line-counting source code. Use `pytest tests/ --collect-only -q` (or `python scripts/check_doc_counts.py`) as the source of truth. Do not verify “number of tests” by counting `def test_` / `class Test` lines with ripgrep (`rg`) or by shell-based line counts, because parametrized/fixture-driven tests can produce multiple collected test items from a single `def` line and this will undercount.

Learnt from: aallan
Repo: aallan/vera PR: 577
File: HISTORY.md:301-311
Timestamp: 2026-05-06T08:33:56.675Z
Learning: Do not trust pytest collection counts from the CodeRabbit sandbox when reporting test counts in Markdown docs (e.g., README.md, HISTORY.md, TESTING.md, ROADMAP.md). The sandbox lacks the full Python dependencies and yields inaccurate counts. Verify counts in the real development environment by running 'pytest tests/ --collect-only -q' or by using 'scripts/check_doc_counts.py', and base the documented numbers on that output. Apply this guidance broadly to Markdown docs in the repo, following the canonical approach referenced by learning 3b066ed3-0d66-41bd-a058-ccdf04052082.

Learnt from: aallan
Repo: aallan/vera PR: 613
File: ROADMAP.md:7-7
Timestamp: 2026-05-07T20:05:33.932Z
Learning: When reviewing Markdown documentation in aallan/vera that involves numeric count/summary verification (including tests, conformance programs, examples, and spec chapters), treat `python scripts/check_doc_counts.py` as the single authoritative source of truth. Do not use cross-references against `README.md` or other Markdown files to validate counts, since they may be temporarily stale during PR series. If you see a count discrepancy detected via `rg`/`grep` against `README.md`, treat it as non-actionable; only fail/flag the issue when `check_doc_counts.py` reports an error.

Learnt from: aallan
Repo: aallan/vera PR: 662
File: HISTORY.md:292-293
Timestamp: 2026-05-12T08:14:28.532Z
Learning: In aallan/vera, ensure the “tagged release” count stays perfectly in sync across exactly two locations per release: (1) the roll-up footer total in HISTORY.md (the “N tagged releases” line) and (2) the “N releases” figure in README.md (around line ~184 in the project-status sentence). ROADMAP.md intentionally does not include this counter. During PR review, flag changes that update only one of these two designated locations without updating the other to the same tagged-release count.

Learnt from: aallan
Repo: aallan/vera PR: 689
File: TESTING.md:0-0
Timestamp: 2026-05-19T11:05:07.302Z
Learning: When reviewing Vera documentation Markdown (e.g., TESTING.md), don’t update only the instance shown in the diff hunk for stale numeric count changes. Instead, treat the old count as a value that must be replaced everywhere in the same file: search the entire file for all literal occurrences of the old number (including table cells, overview/feature-coverage sections, round-trip sections, pre-commit/CI tables, and any other descriptive text) and ensure every occurrence is updated consistently as part of the same change. Use a whole-file search (e.g., grep-style patterns such as `34 example|all 34|All 34` or an equivalent literal-number match) to confirm no old-count references remain before commenting/approving.

Learnt from: aallan
Repo: aallan/vera PR: 689
File: CHANGELOG.md:0-0
Timestamp: 2026-05-19T12:34:26.776Z
Learning: When documenting `IO.read_char` behavior in the aallan/vera docs for Unix TTY inputs, explicitly state that it uses termios *cbreak* mode via `tty.setcbreak()` (so `ISIG` remains enabled and Ctrl-C works), and that it should *not* use raw mode. Also document that in the Unix TTY branch, Ctrl-D maps to EOF, while the non-TTY path reads a single byte using `sys.stdin.read(1)`.

Learnt from: aallan
Repo: aallan/vera PR: 697
File: ROADMAP.md:0-0
Timestamp: 2026-05-26T16:23:53.102Z
Learning: In Markdown documentation, do not accept citation/attribution URLs as valid based solely on plausible formatting. For external attribution links (e.g., tweets/X posts, blog posts, arXiv papers, GitHub permalinks, etc.), verify that the linked content is publicly accessible and matches the claimed author/date/title. If the source cannot be confirmed as accessible/accurate, flag the link as “unverifiable” and recommend replacing it with a confirmed URL or removing the URL while keeping the attribution text (e.g., “Szymon Teżewski (jasisz1)’s 24 May 2026 tweet”). Rule of thumb: verify or omit—never fabricate or assume.

@aallan aallan merged commit d2e11ae into main Jun 11, 2026
27 checks passed
@aallan aallan deleted the feat/222-phase-e-speculative-edit branch June 11, 2026 00:19
@aallan aallan mentioned this pull request Jun 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

LSP server

1 participant