Windows-strict CI: cp1252 fix + drop advisory + bump to v0.0.143#646
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughVersion 0.0.143 is released: parser now loads grammar.lark with explicit UTF‑8, CI test job sets Changesv0.0.143 Release: Windows Encoding, CI Strictness & Test Fixes
Sequence Diagram(s)sequenceDiagram
participant CI
participant TestRunner
participant VeraParser
participant Filesystem
CI->>TestRunner: start test job (PYTHONUTF8=1)
TestRunner->>VeraParser: import -> _get_parser()
VeraParser->>Filesystem: read_text("grammar.lark", encoding="utf-8")
Filesystem-->>VeraParser: UTF-8 grammar
TestRunner->>TestRunner: run tests (tempfile closed before subprocess)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #646 +/- ##
=======================================
Coverage 90.97% 90.97%
=======================================
Files 59 59
Lines 23137 23137
Branches 259 259
=======================================
Hits 21048 21048
Misses 2082 2082
Partials 7 7
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@HISTORY.md`:
- Line 331: Update the top header count in HISTORY.md to match the corrected
total of active development days: change the header line that currently reads
"56" to "57" so it matches the body line "Total: **810+ commits, 143 tagged
releases, 57 active development days.**"; locate and edit the header/title text
(the first few lines of HISTORY.md) to replace the outdated day-count with 57.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 55fb920e-8581-46c3-94cd-e58eb8ed9a20
⛔ Files ignored due to path filters (6)
docs/index.htmlis excluded by!docs/**docs/index.mdis excluded by!docs/**docs/llms-full.txtis excluded by!docs/**docs/llms.txtis excluded by!docs/**docs/sitemap.xmlis excluded by!docs/**uv.lockis excluded by!**/*.lock,!uv.lock
📒 Files selected for processing (8)
.github/workflows/ci.ymlCHANGELOG.mdHISTORY.mdKNOWN_ISSUES.mdREADME.mdpyproject.tomlvera/__init__.pyvera/parser.py
Closes the third sub-issue from PR #639's matrix-expansion follow-up cluster (#641 cp1252 file I/O encoding) and flips the Windows matrix entries from advisory to strict. This is the "now works on Windows" release. Set PYTHONUTF8=1 in the CI test job environment so Python's text-mode open() defaults to UTF-8 regardless of locale (PEP 540). Covers all CI-side encoding failures with a single config change rather than auditing ~30 individual call sites. Closes ~9 failing tests across test_codegen.py, test_codegen_monomorphize.py, test_codegen_closures.py, test_html.py. Also added explicit encoding='utf-8' to vera/parser.py's grammar load — the load-bearing site that runs on every parse, so users on Windows without PYTHONUTF8=1 in their shell still get correct grammar loading (defense-in-depth). Filed #645 for the broader audit (every text-mode open() / read_text() / write_text() call site for explicit encoding) as the durable fix. Drop continue-on-error advisory mode PR #639 added Windows entries with continue-on-error: ${{ matrix.os == 'windows-latest' }} so failures didn't block merges while #640, sub-issues now closed (PR for #642 sanitises Windows paths in test fixtures; PR for #640 reads sys.stdin cross-platform; this PR closes #641). The continue-on-error line is removed in this release; the matrix is now fully strict across all 9 entries (ubuntu / macos / windows × 3.11 / 3.12 / 3.13). Version bump 0.0.142 → 0.0.143 This is the Windows-compatibility milestone — the project went from "untested on Windows" to "Windows is a strict gate" in three sequential PRs (#643 / #644 / this). Marking with a minor version bump and "now works on Windows" framing in CHANGELOG. Co-Authored-By: Claude <noreply@anthropic.invalid>
eb0fbc8 to
676606f
Compare
Header line 3 said 56 active development days; footer line 331 already correctly read 57 after v0.0.143 added a day. Header was the stale half. Trivial single-word fix. Co-Authored-By: Claude <noreply@anthropic.invalid>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@HISTORY.md`:
- Line 290: Update the v0.0.143 entry in HISTORY.md (the line containing
"v0.0.143 | 10 May") to remove the incorrect issue reference `#641` from the list
of closed issues so it only lists `#640` and `#642`; keep the rest of the text
intact (including the notes about Windows CI, `continue-on-error` advisory, and
the matrix) or remove the `#641` citation entirely if you prefer.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6c514633-6d12-40c6-8f0c-fb8fa8658704
📒 Files selected for processing (1)
HISTORY.md
`tests/test_html.py::test_all_vera_blocks_check` and `test_all_vera_blocks_verify` used `tempfile.NamedTemporaryFile(..., delete=True)` then ran `vera <subcmd> f.name` via subprocess while the `with` block held the file handle open. On Windows you can't open a file twice while one handle is still held; the subprocess hit a Python traceback (truncated to "Traceback (most recent call last):" by the test's first-line error capture). This is the classic Windows `NamedTemporaryFile(delete=True)` gotcha — Unix allows concurrent file handles, Windows doesn't. All other test fixtures in the suite already use `delete=False` and clean up explicitly; only test_html.py had the legacy `delete=True` pattern. Fix: convert both sites to `delete=False` + manual `Path.unlink()` in a `finally` block. Closes the last 2 Windows-only test failures on PR #646; matrix should now go fully green across all 9 strict entries. Co-Authored-By: Claude <noreply@anthropic.invalid>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/test_html.py`:
- Around line 112-114: tempfile.NamedTemporaryFile is invoked twice in
tests/test_html.py (the NamedTemporaryFile call that assigns to f around lines
where mode="w", suffix=".vera", dir=str(ROOT), delete=False, and the second
similar call later) without an explicit encoding, which can cause non-ASCII
writes to fail on some locales; update both calls to include encoding='utf-8' so
text-mode writes use UTF-8 (i.e., add encoding='utf-8' to the
NamedTemporaryFile(...) arguments for the NamedTemporaryFile that creates the
Vera temp files).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1fde7034-e146-43a7-9026-94865157c15f
📒 Files selected for processing (2)
TESTING.mdtests/test_html.py
The post-#637 Windows CI rollout (PRs #639/#643/#644/#646) hit three test-fixture footguns — `tempfile.NamedTemporaryFile`'s delete-while-open behaviour, Windows paths embedded in Vera string literals tripping the `\U` escape, and cp1252 default file I/O encoding. Each had a portable workaround. Added a "Test Fixture Conventions" section to TESTING.md (immediately before "Adding Tests") with the three rules + their canonical wrong/right code snippets. Each rule cites the test that surfaced the bug + the PR that fixed it. Added a brief "Cross-platform pitfalls" section to CLAUDE.md's project-orientation footer that points at the TESTING.md section. The CLAUDE.md entry is a one-bullet pointer per rule, not a full repeat — keeping the substantive content in TESTING.md as the single source of truth. These notes will save the next contributor (or agent) from re-discovering the same footguns. Co-Authored-By: Claude <noreply@anthropic.invalid>
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@CLAUDE.md`:
- Around line 183-185: The doc incorrectly suggests using
tmp_path.replace(os.sep, "/") on a pathlib.Path; replace any occurrences or
examples that call tmp_path.replace(...) with tmp_path.as_posix() so the pytest
tmp_path Path is converted to a POSIX string properly (e.g., where the guidance
talks about "Paths embedded into Vera string literals" use tmp_path.as_posix());
update the wording/examples around tmp_path to show .as_posix() and ensure any
surrounding explanation mentions converting the Path to a string for Vera
literals.
In `@TESTING.md`:
- Around line 337-338: Update TESTING.md so its Overview and CI Pipeline matrix
rows reflect that Windows is now part of the active CI matrix: find the phrases
"6 combinations (Ubuntu/macOS)" and the Overview/CI Pipeline rows that enumerate
platform combinations and change them to include Windows (and adjust the total
count accordingly), and update the section text that claims only Ubuntu/macOS to
state the current active matrix includes Windows as well; ensure the matrix
labels and counts are consistent with the new "Windows rollout" paragraph added
earlier in the file.
- Around line 371-372: The example uses tmp_path.replace(os.sep, "/") which
calls pathlib.Path.replace (wrong signature) and raises a TypeError; change the
conversion to POSIX to use the Path method tmp_path.as_posix() instead so
vera_path is a POSIX string (used when building source =
f'IO.read_file("{vera_path}")'); update the reference of vera_path to be
tmp_path.as_posix() everywhere.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 23b46a03-beb7-4be6-be3c-94ed07627c08
📒 Files selected for processing (2)
CLAUDE.mdTESTING.md
…deRabbit)
Both `NamedTemporaryFile` calls in test_html.py omitted the
`encoding=` kwarg. Per the convention I just added to TESTING.md
("File I/O without explicit encoding falls back to the locale
default"), the explicit form is the durable fix — CI sets
PYTHONUTF8=1 as a backstop, but a local-developer run on
Windows without that env var would hit cp1252 fallbacks if the
HTML code blocks ever contain non-ASCII characters.
Trivial — and consistent with the rule the same PR documents.
Co-Authored-By: Claude <noreply@anthropic.invalid>
Three findings on PR #646's docs additions: 1. CLAUDE.md and TESTING.md examples used `tmp_path.replace(os.sep, "/")` to get POSIX form. Works on str, but BROKEN on pathlib.Path (Path.replace is the rename method — different signature, would silently move the file). Variable name `tmp_path` evokes pytest's Path-typed fixture. Switched to `Path(tmp_path).as_posix()` which works for both str and Path. The actual test code at tests/test_codegen.py:7760 / 7787 still uses str.replace because the variables there are str (from `tempfile.NamedTemporaryFile().name` and `os.path.join`) — that's correct in context, no change needed. 2. TESTING.md Overview row line 18 said "6 combinations (Ubuntu/macOS)" — pre-Windows. CI Pipeline section line 484 said the same. Both updated to "9 combinations (Ubuntu/macOS/Windows)". Same shape as the README test-count drift caught in PR #631 — matrix-related docs across multiple files don't auto-sync. Worth flagging the doc-counts script's coverage gap as a follow-up if it recurs. Co-Authored-By: Claude <noreply@anthropic.invalid>
Summary
Closes #641. The "now works on Windows" release — closes the third and final sub-issue from PR #639's matrix-expansion follow-up cluster, drops the
continue-on-erroradvisory, and bumps to v0.0.143.What changes
Three things in one PR — appropriate for the milestone framing (Windows goes from untested to strict gate):
Default cp1252 encoding causes Windows test failures — audit file I/O for explicit UTF-8 #641 fix. Set
PYTHONUTF8=1in the CI test job env (PEP 540 — Python's text-modeopen()defaults to UTF-8 regardless of locale). Closes ~9 failing tests acrosstest_codegen.py,test_codegen_monomorphize.py,test_codegen_closures.py,test_html.py(all'charmap' codec can't encode '→'/'utf-8' can't decode 0x97flavours of the same root cause). Also added explicitencoding='utf-8'tovera/parser.py's grammar load — the load-bearing site that runs on every parse.Drop
continue-on-erroradvisory. PR Add windows-latest to CI test matrix in advisory mode (closes #637) #639 added Windows entries withcontinue-on-error: ${{ matrix.os == 'windows-latest' }}so failures didn't block merges while sub-issues were being fixed. All three sub-issues (Vera CLI's /dev/stdin is Unix-only — add Windows-compatible stdin handling #640 stdin via PR Read stdin via sys.stdin for cross-platform support (closes #640) #644, Default cp1252 encoding causes Windows test failures — audit file I/O for explicit UTF-8 #641 cp1252 here, test_io_read_file_roundtrip: Windows path embedded in Vera string literal triggers \U escape #642 path escapes via PR Sanitise Windows paths in IO test fixtures (closes #642) #643) now closed. Matrix is now fully strict across all 9 entries.Version bump 0.0.142 → 0.0.143. This is the Windows-compatibility milestone — the project went from "untested on Windows" to "Windows is a strict gate" in three sequential PRs.
Why Option B (PYTHONUTF8=1) instead of Option A (audit each call site)
Per the original #641 issue body, two options were on the table:
encoding='utf-8'at every text-mode call site (~30 sites acrossvera/,scripts/,tests/)PYTHONUTF8=1in CI as the stopgapThis PR goes with B for the CI fix because it's a single-line change vs. ~30 mechanical edits, and the user asked to "burn through" the cluster. #645 is filed to track Option A as the durable follow-up — explicit per-site encoding is what protects users running on Windows without
PYTHONUTF8=1in their shell environment.Test plan
Sibling PRs
\Uescape in test fixtures (addresses test_io_read_file_roundtrip: Windows path embedded in Vera string literal triggers \U escape #642)/dev/stdincross-platform (addresses Vera CLI's /dev/stdin is Unix-only — add Windows-compatible stdin handling #640)This PR depends on the other two having landed (or at least having their fixes in main) so the matrix can be flipped to strict cleanly. If those land first and CI is green, this one's clean.
🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Tests
Chores
Documentation