Bump test_browser.py Node subprocess timeout from 30s to 60s to reduce Windows CI flakes #694

@aallan

Symptom

tests/test_browser.py::test_stdout_parity[hello_world] intermittently fails on test (windows-latest, 3.12) with subprocess.TimeoutExpired after 30 seconds. Most recently seen on PR #693 (CI run 26279976744) where 3.11 and 3.13 passed on the same Windows runner but 3.12 hit the timeout on a trivial hello_world program:

FAILED tests/test_browser.py::test_stdout_parity[hello_world]
  - subprocess.TimeoutExpired: Command
    '['C:\\Program Files\\nodejs\\node.EXE',
      '--experimental-wasm-exnref',
      'D:\\a\\vera\\vera\\vera\\browser\\harness.mjs',
      'C:\\Users\\runneradmin\\AppData\\Local\\Temp\\...\\test.wasm']'
    timed out after 30 seconds
=========== 1 failed, 3913 passed, 15 skipped in 160.54s ============

A re-run of just that job passed cleanly, confirming this is a flake rather than a regression.

Cause

Node startup on Windows GitHub Actions runners is highly variable (cold-cache disk reads, antivirus scans, etc.), and the --experimental-wasm-exnref flag triggers extra V8 codegen on first execution. The 30-second timeout in tests/test_browser.py (the timeout=30 arguments to subprocess.run at lines 160 and 1736) leaves too little headroom for a cold Node startup on a slow runner instance.

Asymmetry test (why this is a flake, not a regression)

If the timeout were caused by user-code changes, all three Windows-Python matrix cells (3.11 / 3.12 / 3.13) would fail symmetrically — they share the same WASM module, the same harness.mjs, the same Node binary on the same runner. Instead only one cell fails at a time, and which one varies between CI runs. That's the signature of cold-cache Node startup pushing one matrix cell over the timeout while the others, running back-to-back on the same machine, benefit from a warm cache.
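
The asymmetry argument can be written down as a small heuristic. This is only a sketch: the function name and the result strings are made up for illustration, and the input is simply the pass/fail outcome of each Windows matrix cell.

```python
def classify_windows_failures(results):
    """Classify one CI run; results maps Python version -> True if the cell passed.

    Hypothetical helper, not part of the Vera test suite.
    """
    failed = sorted(version for version, ok in results.items() if not ok)
    if not failed:
        return "pass"
    if len(failed) == len(results):
        # Every cell shares the same WASM module, harness and Node binary,
        # so a real regression should break all of them together.
        return "symmetric failure: suspect a regression"
    return "asymmetric failure: suspect a flake in " + ", ".join(failed)


# The run described above: only the 3.12 cell timed out.
run_693 = {"3.11": True, "3.12": False, "3.13": True}
verdict = classify_windows_failures(run_693)
```

A single asymmetric run is suggestive rather than conclusive. The stronger signal is that the failing cell changes from run to run.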

Fix

Bump both timeout=30 literals in tests/test_browser.py (lines 160 and 1736) to timeout=60. This is a trivial change with no knock-on effects. The longer timeout does not slow CI when tests pass: subprocess.run returns as soon as the child process exits, so the timeout only comes into play when a run hangs or stalls.

- timeout=30,
+ timeout=60,
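
To illustrate why the larger value costs nothing on a passing run, here is a minimal sketch of how subprocess.run treats its timeout. The constant name NODE_TIMEOUT_SECONDS and the helper run_with_timeout are hypothetical and do not appear in tests/test_browser.py. The child process is the current Python interpreter rather than Node, so the snippet runs anywhere.

```python
import subprocess
import sys

# Hypothetical shared constant; the actual fix only changes two literals.
NODE_TIMEOUT_SECONDS = 60


def run_with_timeout(cmd, timeout=NODE_TIMEOUT_SECONDS):
    """Run cmd and capture its text output; timeout is only an upper bound."""
    return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)


# A fast child returns as soon as it exits, regardless of the 60 s ceiling.
fast = run_with_timeout([sys.executable, "-c", "print('hello')"])

# A child that outlives the ceiling is killed and TimeoutExpired is raised.
try:
    run_with_timeout(
        [sys.executable, "-c", "import time; time.sleep(5)"],
        timeout=0.5,
    )
    timed_out = False
except subprocess.TimeoutExpired:
    timed_out = True
```

Pulling the value into one constant would also keep the two call sites in sync, but that is a design option beyond the literal bump proposed here.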

Out of scope (deliberately)

This is purely a CI-stability follow-up. The underlying Node-startup latency is a property of the Windows runners, not a Vera bug. If the flake recurs after the bump:

  1. Escalate to a marker-based skip on Windows (@pytest.mark.skipif(sys.platform == "win32", reason="...") on the affected tests) — accepts the coverage loss for stability.
  2. Or investigate a faster Node startup path — for instance, only emit --experimental-wasm-exnref for modules that actually use exception-handling instructions, since hello_world doesn't.

Both are larger changes than a literal bump and shouldn't block the simple fix.
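
For reference, the second option could look roughly like the sketch below. The function build_node_command and its uses_exceptions parameter are hypothetical. The issue does not say how Vera would detect that a module uses exception-handling instructions, so that decision is taken as an input here.

```python
def build_node_command(node, harness, wasm_path, uses_exceptions):
    """Build the Node argv, adding the exnref flag only when it is needed.

    Hypothetical helper: uses_exceptions is assumed to be supplied by the
    caller, for example from information the compiler already has.
    """
    cmd = [node]
    if uses_exceptions:
        cmd.append("--experimental-wasm-exnref")
    cmd += [harness, wasm_path]
    return cmd


# hello_world uses no exception handling, so the flag is omitted.
plain = build_node_command("node", "harness.mjs", "test.wasm", False)
flagged = build_node_command("node", "harness.mjs", "test.wasm", True)
```

Whether dropping the flag measurably shortens cold startup on Windows runners would still need to be checked before relying on this path.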
