Skip to content

test: replace manual state-reset fixtures with pytest-isolate#28956

Closed
ethernet8023 wants to merge 1 commit into
mainfrom
change/ci-test-parallelism
Closed

test: replace manual state-reset fixtures with pytest-isolate#28956
ethernet8023 wants to merge 1 commit into
mainfrom
change/ci-test-parallelism

Conversation

@ethernet8023

Copy link
Copy Markdown
Collaborator

What does this PR do?

Replaces the ~130-line manual "reset module state" machinery in tests/conftest.py with pytest-isolate, which forks each test into its ownsubprocess. This eliminates all cross-test state leakage (module-level dicts, sets, ContextVars) at the process boundary level — the old fixturewas a best-effort workaround that grew brittle as new state buckets were added.

With true isolation in place, the xdist worker cap is lifted from 4 to auto. Developers on big machines finally get parallelism proportional totheir core count without "works alone, flakes in CI" bugs.

Related Issue

Fixes #

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • pyproject.toml: added --isolate to pytest addopts, set isolate_timeout = "30", added pytest-isolate>=0.0.13 to dev deps

  • tests/conftest.py: removed _reset_module_state autouse fixture (~120 lines clearing approval state, gateway ContextVars, terminal env caches,file-ops caches, etc.), removed _enforce_test_timeout + _timeout_handler, removed _reset_tool_registry_caches: replaced by process-per-test isolation. Cleaned up unused logging, signal, tempfile imports.

  • scripts/run_tests.sh: bumped default WORKERS from 4 to auto, updated header comment

How to Test

  1. Run scripts/run_tests.sh and verify the suite passes at full parallelism
  2. Run a historically flaky module at high worker count to confirm isolation works: python -m pytest tests/tools/test_approval.py -n 16 --tb=short

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: NixOS

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

@ethernet8023 ethernet8023 requested a review from a team May 19, 2026 21:56
@github-actions

Copy link
Copy Markdown
Contributor

🚨 CRITICAL Supply Chain Risk Detected

This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

🚨 CRITICAL: Install-hook file added or modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py

Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.

@github-actions

Copy link
Copy Markdown
Contributor

⚠️ Unbounded PyPI Dependency Detected

This PR adds PyPI dependencies without a <next_major upper bound. Per our supply chain policy, all PyPI deps must be pinned as >=floor,<next_major.

Unbounded specs found:

"pytest-isolate>=0.0.13"

Fix: Add an upper bound, e.g. "package>=1.2.0,<2"


See PR #2810 and CONTRIBUTING.md for the full policy rationale.

@github-actions

github-actions Bot commented May 19, 2026

Copy link
Copy Markdown
Contributor

🔎 Lint report: change/ci-test-parallelism vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8985 on HEAD, 8985 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 4736 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@ethernet8023 ethernet8023 force-pushed the change/ci-test-parallelism branch from 87ba526 to 8aef067 Compare May 19, 2026 21:57
@github-actions

Copy link
Copy Markdown
Contributor

🚨 CRITICAL Supply Chain Risk Detected

This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

🚨 CRITICAL: Install-hook file added or modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py

Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.

@alt-glitch alt-glitch added type/test Test coverage or test infrastructure type/refactor Code restructuring, no behavior change P3 Low — cosmetic, nice to have labels May 19, 2026
Each test now runs in a forked subprocess (--isolate), so module-level
dicts/sets/ContextVars can no longer leak between tests on the same
xdist worker. This eliminates the need for:

- _reset_module_state autouse fixture (~120 lines clearing approval
  state, gateway session ContextVars, terminal env caches, etc.)
- _enforce_test_timeout + _timeout_handler (SIGALRM-based 30s timeout)
- _reset_tool_registry_caches (registry + tool defs cache invalidation)

With isolation in place, bump the xdist worker cap from 4 to auto in
scripts/run_tests.sh and pyproject.toml addopts. Developers on big
machines finally get parallelism proportional to their core count.
@ethernet8023 ethernet8023 force-pushed the change/ci-test-parallelism branch from 8aef067 to 48a1fa9 Compare May 19, 2026 22:05
@github-actions

Copy link
Copy Markdown
Contributor

🚨 CRITICAL Supply Chain Risk Detected

This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

🚨 CRITICAL: Install-hook file added or modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py

Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.

@AJV20

AJV20 commented May 19, 2026

Copy link
Copy Markdown

Opened #28966 as a branch updated onto current origin/main to clear the false-positive supply-chain comments here. The old BASE..HEAD comparison in CI no longer includes hermes_cli/setup.py, and the added pytest-isolate dependency remains exact-pinned (==0.0.13).

@ethernet8023 ethernet8023 deleted the change/ci-test-parallelism branch May 22, 2026 20:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P3 Low — cosmetic, nice to have type/refactor Code restructuring, no behavior change type/test Test coverage or test infrastructure

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants