Windows-strict CI: cp1252 fix + drop advisory + bump to v0.0.143 by aallan · Pull Request #646 · aallan/vera

aallan · 2026-05-10T22:19:45Z

Summary

Closes #641. The "now works on Windows" release — closes the third and final sub-issue from PR #639's matrix-expansion follow-up cluster, drops the continue-on-error advisory, and bumps to v0.0.143.

What changes

Three things in one PR — appropriate for the milestone framing (Windows goes from untested to strict gate):

Default cp1252 encoding causes Windows test failures — audit file I/O for explicit UTF-8 #641 fix. Set PYTHONUTF8=1 in the CI test job env (PEP 540 — Python's text-mode open() defaults to UTF-8 regardless of locale). Closes ~9 failing tests across test_codegen.py, test_codegen_monomorphize.py, test_codegen_closures.py, test_html.py (all 'charmap' codec can't encode '→' / 'utf-8' can't decode 0x97 flavours of the same root cause). Also added explicit encoding='utf-8' to vera/parser.py's grammar load — the load-bearing site that runs on every parse.
Drop continue-on-error advisory. PR Add windows-latest to CI test matrix in advisory mode (closes #637) #639 added Windows entries with continue-on-error: ${{ matrix.os == 'windows-latest' }} so failures didn't block merges while sub-issues were being fixed. All three sub-issues (Vera CLI's /dev/stdin is Unix-only — add Windows-compatible stdin handling #640 stdin via PR Read stdin via sys.stdin for cross-platform support (closes #640) #644, Default cp1252 encoding causes Windows test failures — audit file I/O for explicit UTF-8 #641 cp1252 here, test_io_read_file_roundtrip: Windows path embedded in Vera string literal triggers \U escape #642 path escapes via PR Sanitise Windows paths in IO test fixtures (closes #642) #643) now closed. Matrix is now fully strict across all 9 entries.
Version bump 0.0.142 → 0.0.143. This is the Windows-compatibility milestone — the project went from "untested on Windows" to "Windows is a strict gate" in three sequential PRs.

Why Option B (PYTHONUTF8=1) instead of Option A (audit each call site)

Per the original #641 issue body, two options were on the table:

A: explicit encoding='utf-8' at every text-mode call site (~30 sites across vera/, scripts/, tests/)
B: PYTHONUTF8=1 in CI as the stopgap

This PR goes with B for the CI fix because it's a single-line change vs. ~30 mechanical edits, and the user asked to "burn through" the cluster. #645 is filed to track Option A as the durable follow-up — explicit per-site encoding is what protects users running on Windows without PYTHONUTF8=1 in their shell environment.

Test plan

All 3,782 tests pass on macOS (existing behaviour preserved)
mypy clean
Doc counts consistent
Version sync clean (6 files at 0.0.143)
CHANGELOG entry follows Keep-a-Changelog format with comparison link
HISTORY.md row appended for v0.0.143 (single-sentence Stage 12 style)
KNOWN_ISSUES.md updated (Add windows-latest to CI test matrix (3 runners mirroring Ubuntu / macOS) #637/Vera CLI's /dev/stdin is Unix-only — add Windows-compatible stdin handling #640/Default cp1252 encoding causes Windows test failures — audit file I/O for explicit UTF-8 #641/test_io_read_file_roundtrip: Windows path embedded in Vera string literal triggers \U escape #642 removed; Audit text-mode open()/read_text()/write_text() for explicit encoding='utf-8' (#641 follow-up) #645 added)
All 9 matrix entries pass on this PR's CI run (the canonical validation — Windows entries now strict)
After merge: tag v0.0.143 and publish release

Sibling PRs

#643 — \U escape in test fixtures (addresses test_io_read_file_roundtrip: Windows path embedded in Vera string literal triggers \U escape #642)
#644 — /dev/stdin cross-platform (addresses Vera CLI's /dev/stdin is Unix-only — add Windows-compatible stdin handling #640)
This PR — cp1252 + drop advisory + version bump (addresses Default cp1252 encoding causes Windows test failures — audit file I/O for explicit UTF-8 #641)

This PR depends on the other two having landed (or at least having their fixes in main) so the matrix can be flipped to strict cleanly. If those land first and CI is green, this one's clean.

🤖 Generated with Claude Code

Summary by CodeRabbit

Bug Fixes
- Improved stdin handling and enforced UTF‑8 text processing on Windows for more reliable test/run behaviour
- Normalised Windows temporary paths in tests to avoid escape-related failures
Tests
- Updated temporary-file handling in tests to reliably run CLI checks across platforms
Chores
- Bumped project version to v0.0.143
- Windows tests no longer run in advisory mode and now block merges alongside other platforms
Documentation
- Release notes, changelog and history updated for v0.0.143

coderabbitai · 2026-05-10T22:19:55Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Version 0.0.143 is released: parser now loads grammar.lark with explicit UTF‑8, CI test job sets PYTHONUTF8=1 and no longer treats Windows results as advisory, tests adjust tempfile lifecycles for subprocess compatibility, and version/docs/changelog are updated.

Changes

v0.0.143 Release: Windows Encoding, CI Strictness & Test Fixes

Layer / File(s)	Summary
Parser UTF-8 Grammar Loading `vera/parser.py`	`_get_parser()` now reads `grammar.lark` with `read_text(encoding="utf-8")`; Lark configuration unchanged.
CI Workflow `.github/workflows/ci.yml`	`jobs.test` adds job-level `env: PYTHONUTF8: 1` and removes `continue-on-error: ${{ matrix.os == 'windows-latest' }}` so Windows failures fail the job.
Test lifecycle changes `tests/test_html.py`, `TESTING.md`	`test_all_vera_blocks_check` and `test_all_vera_blocks_verify` now use `NamedTemporaryFile(..., delete=False)`, close the handle before invoking subprocesses, and unlink the file in `finally`; `TESTING.md` test-line count and fixture conventions updated.
Documentation & Local Guidance `CLAUDE.md`, `TESTING.md`	Added shell pitfalls and cross-platform test-fixture guidance (NamedTemporaryFile rules, POSIX path formatting in Vera literals, and explicit `encoding="utf-8"`).
Release Notes & Known Issues `CHANGELOG.md`, `HISTORY.md`, `KNOWN_ISSUES.md`	Inserted v0.0.143 release entry and history row; updated compare links; replaced prior Windows advisory test-gap entries with requirement to use explicit `encoding='utf-8'` in text-mode file I/O.
Version Metadata `pyproject.toml`, `vera/__init__.py`, `README.md`	Bumped version from `0.0.142` → `0.0.143` in project metadata, exported constants, and project status text.

Sequence Diagram(s)

sequenceDiagram
  participant CI
  participant TestRunner
  participant VeraParser
  participant Filesystem
  CI->>TestRunner: start test job (PYTHONUTF8=1)
  TestRunner->>VeraParser: import -> _get_parser()
  VeraParser->>Filesystem: read_text("grammar.lark", encoding="utf-8")
  Filesystem-->>VeraParser: UTF-8 grammar
  TestRunner->>TestRunner: run tests (tempfile closed before subprocess)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Audit text-mode open()/read_text()/write_text() for explicit encoding='utf-8' (#641 follow-up) #645: Partial overlap — implements PYTHONUTF8 and explicit grammar UTF‑8 reading; per-site encoding audit remains.
test_io_read_file_roundtrip: Windows path embedded in Vera string literal triggers \U escape #642: Related — addresses Windows temp-path / \U escape test failures in fixtures.
Add windows-latest to CI test matrix (3 runners mirroring Ubuntu / macOS) #637: Related — both touch Windows CI matrix and runner behaviour.

Possibly related PRs

aallan/vera#643: Related — similar fixes to sanitize Windows file paths in tests.
aallan/vera#639: Related — that PR added Windows advisory behaviour which this PR removes.
aallan/vera#638: Related — overlaps KNOWN_ISSUES edits replaced by this PR.

Suggested labels

compiler, tests, ci, docs

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the three main changes: enabling UTF-8 on Windows (cp1252 fix), removing the continue-on-error advisory, and version bump to v0.0.143.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch claude/issue-641-utf8-encoding

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2026-05-10T22:20:38Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.97%. Comparing base (6922fe6) to head (884aa0e).

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #646   +/-   ##
=======================================
  Coverage   90.97%   90.97%           
=======================================
  Files          59       59           
  Lines       23137    23137           
  Branches      259      259           
=======================================
  Hits        21048    21048           
  Misses       2082     2082           
  Partials        7        7

Flag	Coverage Δ
javascript	`57.36% <ø> (ø)`
python	`94.76% <100.00%> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@HISTORY.md`:
- Line 331: Update the top header count in HISTORY.md to match the corrected
total of active development days: change the header line that currently reads
"56" to "57" so it matches the body line "Total: **810+ commits, 143 tagged
releases, 57 active development days.**"; locate and edit the header/title text
(the first few lines of HISTORY.md) to replace the outdated day-count with 57.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 55fb920e-8581-46c3-94cd-e58eb8ed9a20

📥 Commits

Reviewing files that changed from the base of the PR and between f148e54 and eb0fbc8.

⛔ Files ignored due to path filters (6)

docs/index.html is excluded by !docs/**
docs/index.md is excluded by !docs/**
docs/llms-full.txt is excluded by !docs/**
docs/llms.txt is excluded by !docs/**
docs/sitemap.xml is excluded by !docs/**
uv.lock is excluded by !**/*.lock, !uv.lock

📒 Files selected for processing (8)

.github/workflows/ci.yml
CHANGELOG.md
HISTORY.md
KNOWN_ISSUES.md
README.md
pyproject.toml
vera/__init__.py
vera/parser.py

Closes the third sub-issue from PR #639's matrix-expansion follow-up cluster (#641 cp1252 file I/O encoding) and flips the Windows matrix entries from advisory to strict. This is the "now works on Windows" release. Set PYTHONUTF8=1 in the CI test job environment so Python's text-mode open() defaults to UTF-8 regardless of locale (PEP 540). Covers all CI-side encoding failures with a single config change rather than auditing ~30 individual call sites. Closes ~9 failing tests across test_codegen.py, test_codegen_monomorphize.py, test_codegen_closures.py, test_html.py. Also added explicit encoding='utf-8' to vera/parser.py's grammar load — the load-bearing site that runs on every parse, so users on Windows without PYTHONUTF8=1 in their shell still get correct grammar loading (defense-in-depth). Filed #645 for the broader audit (every text-mode open() / read_text() / write_text() call site for explicit encoding) as the durable fix. Drop continue-on-error advisory mode PR #639 added Windows entries with continue-on-error: ${{ matrix.os == 'windows-latest' }} so failures didn't block merges while #640, sub-issues now closed (PR for #642 sanitises Windows paths in test fixtures; PR for #640 reads sys.stdin cross-platform; this PR closes #641). The continue-on-error line is removed in this release; the matrix is now fully strict across all 9 entries (ubuntu / macos / windows × 3.11 / 3.12 / 3.13). Version bump 0.0.142 → 0.0.143 This is the Windows-compatibility milestone — the project went from "untested on Windows" to "Windows is a strict gate" in three sequential PRs (#643 / #644 / this). Marking with a minor version bump and "now works on Windows" framing in CHANGELOG. Co-Authored-By: Claude <noreply@anthropic.invalid>

Header line 3 said 56 active development days; footer line 331 already correctly read 57 after v0.0.143 added a day. Header was the stale half. Trivial single-word fix. Co-Authored-By: Claude <noreply@anthropic.invalid>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@HISTORY.md`:
- Line 290: Update the v0.0.143 entry in HISTORY.md (the line containing
"v0.0.143 | 10 May") to remove the incorrect issue reference `#641` from the list
of closed issues so it only lists `#640` and `#642`; keep the rest of the text
intact (including the notes about Windows CI, `continue-on-error` advisory, and
the matrix) or remove the `#641` citation entirely if you prefer.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6c514633-6d12-40c6-8f0c-fb8fa8658704

📥 Commits

Reviewing files that changed from the base of the PR and between 676606f and 74948cb.

📒 Files selected for processing (1)

HISTORY.md

`tests/test_html.py::test_all_vera_blocks_check` and `test_all_vera_blocks_verify` used `tempfile.NamedTemporaryFile(..., delete=True)` then ran `vera <subcmd> f.name` via subprocess while the `with` block held the file handle open. On Windows you can't open a file twice while one handle is still held; the subprocess hit a Python traceback (truncated to "Traceback (most recent call last):" by the test's first-line error capture). This is the classic Windows `NamedTemporaryFile(delete=True)` gotcha — Unix allows concurrent file handles, Windows doesn't. All other test fixtures in the suite already use `delete=False` and clean up explicitly; only test_html.py had the legacy `delete=True` pattern. Fix: convert both sites to `delete=False` + manual `Path.unlink()` in a `finally` block. Closes the last 2 Windows-only test failures on PR #646; matrix should now go fully green across all 9 strict entries. Co-Authored-By: Claude <noreply@anthropic.invalid>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_html.py`:
- Around line 112-114: tempfile.NamedTemporaryFile is invoked twice in
tests/test_html.py (the NamedTemporaryFile call that assigns to f around lines
where mode="w", suffix=".vera", dir=str(ROOT), delete=False, and the second
similar call later) without an explicit encoding, which can cause non-ASCII
writes to fail on some locales; update both calls to include encoding='utf-8' so
text-mode writes use UTF-8 (i.e., add encoding='utf-8' to the
NamedTemporaryFile(...) arguments for the NamedTemporaryFile that creates the
Vera temp files).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1fde7034-e146-43a7-9026-94865157c15f

📥 Commits

Reviewing files that changed from the base of the PR and between 74948cb and 58c9dd5.

📒 Files selected for processing (2)

TESTING.md
tests/test_html.py

The post-#637 Windows CI rollout (PRs #639/#643/#644/#646) hit three test-fixture footguns — `tempfile.NamedTemporaryFile`'s delete-while-open behaviour, Windows paths embedded in Vera string literals tripping the `\U` escape, and cp1252 default file I/O encoding. Each had a portable workaround. Added a "Test Fixture Conventions" section to TESTING.md (immediately before "Adding Tests") with the three rules + their canonical wrong/right code snippets. Each rule cites the test that surfaced the bug + the PR that fixed it. Added a brief "Cross-platform pitfalls" section to CLAUDE.md's project-orientation footer that points at the TESTING.md section. The CLAUDE.md entry is a one-bullet pointer per rule, not a full repeat — keeping the substantive content in TESTING.md as the single source of truth. These notes will save the next contributor (or agent) from re-discovering the same footguns. Co-Authored-By: Claude <noreply@anthropic.invalid>

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CLAUDE.md`:
- Around line 183-185: The doc incorrectly suggests using
tmp_path.replace(os.sep, "/") on a pathlib.Path; replace any occurrences or
examples that call tmp_path.replace(...) with tmp_path.as_posix() so the pytest
tmp_path Path is converted to a POSIX string properly (e.g., where the guidance
talks about "Paths embedded into Vera string literals" use tmp_path.as_posix());
update the wording/examples around tmp_path to show .as_posix() and ensure any
surrounding explanation mentions converting the Path to a string for Vera
literals.

In `@TESTING.md`:
- Around line 337-338: Update TESTING.md so its Overview and CI Pipeline matrix
rows reflect that Windows is now part of the active CI matrix: find the phrases
"6 combinations (Ubuntu/macOS)" and the Overview/CI Pipeline rows that enumerate
platform combinations and change them to include Windows (and adjust the total
count accordingly), and update the section text that claims only Ubuntu/macOS to
state the current active matrix includes Windows as well; ensure the matrix
labels and counts are consistent with the new "Windows rollout" paragraph added
earlier in the file.
- Around line 371-372: The example uses tmp_path.replace(os.sep, "/") which
calls pathlib.Path.replace (wrong signature) and raises a TypeError; change the
conversion to POSIX to use the Path method tmp_path.as_posix() instead so
vera_path is a POSIX string (used when building source =
f'IO.read_file("{vera_path}")'); update the reference of vera_path to be
tmp_path.as_posix() everywhere.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 23b46a03-beb7-4be6-be3c-94ed07627c08

📥 Commits

Reviewing files that changed from the base of the PR and between 58c9dd5 and c8c063a.

📒 Files selected for processing (2)

CLAUDE.md
TESTING.md

…deRabbit) Both `NamedTemporaryFile` calls in test_html.py omitted the `encoding=` kwarg. Per the convention I just added to TESTING.md ("File I/O without explicit encoding falls back to the locale default"), the explicit form is the durable fix — CI sets PYTHONUTF8=1 as a backstop, but a local-developer run on Windows without that env var would hit cp1252 fallbacks if the HTML code blocks ever contain non-ASCII characters. Trivial — and consistent with the rule the same PR documents. Co-Authored-By: Claude <noreply@anthropic.invalid>

Three findings on PR #646's docs additions: 1. CLAUDE.md and TESTING.md examples used `tmp_path.replace(os.sep, "/")` to get POSIX form. Works on str, but BROKEN on pathlib.Path (Path.replace is the rename method — different signature, would silently move the file). Variable name `tmp_path` evokes pytest's Path-typed fixture. Switched to `Path(tmp_path).as_posix()` which works for both str and Path. The actual test code at tests/test_codegen.py:7760 / 7787 still uses str.replace because the variables there are str (from `tempfile.NamedTemporaryFile().name` and `os.path.join`) — that's correct in context, no change needed. 2. TESTING.md Overview row line 18 said "6 combinations (Ubuntu/macOS)" — pre-Windows. CI Pipeline section line 484 said the same. Both updated to "9 combinations (Ubuntu/macOS/Windows)". Same shape as the README test-count drift caught in PR #631 — matrix-related docs across multiple files don't auto-sync. Worth flagging the doc-counts script's coverage gap as a follow-up if it recurs. Co-Authored-By: Claude <noreply@anthropic.invalid>

coderabbitai Bot reviewed May 10, 2026

View reviewed changes

Comment thread HISTORY.md

aallan force-pushed the claude/issue-641-utf8-encoding branch from eb0fbc8 to 676606f Compare May 10, 2026 22:34

HISTORY: align header day-count with footer (CodeRabbit)

74948cb

Header line 3 said 56 active development days; footer line 331 already correctly read 57 after v0.0.143 added a day. Header was the stale half. Trivial single-word fix. Co-Authored-By: Claude <noreply@anthropic.invalid>

coderabbitai Bot reviewed May 10, 2026

View reviewed changes

Comment thread HISTORY.md

coderabbitai Bot reviewed May 10, 2026

View reviewed changes

Comment thread tests/test_html.py

coderabbitai Bot reviewed May 10, 2026

View reviewed changes

Comment thread CLAUDE.md

Comment thread TESTING.md

Comment thread TESTING.md Outdated

aallan and others added 2 commits May 11, 2026 00:01

aallan merged commit 2407624 into main May 10, 2026
23 checks passed

aallan deleted the claude/issue-641-utf8-encoding branch May 10, 2026 23:12

Conversation

aallan commented May 10, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What changes

Why Option B (PYTHONUTF8=1) instead of Option A (audit each call site)

Test plan

Sibling PRs

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related issues

Possibly related PRs

Suggested labels

Uh oh!

codecov Bot commented May 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

aallan commented May 10, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 10, 2026 •

edited

Loading

codecov Bot commented May 10, 2026 •

edited

Loading