Updates test suite to be used with pytest-run-parallel #8065
ngoldbaum wants to merge 46 commits into scikit-image:main
Conversation
Commits: Address failing tests · Address failing test · Mark test as thread_unsafe · Mark test as thred_unsafe · Debug via SSH · Mark test_rotate as thread_unsafe · fixes quite a few conflicts · Merges edgar's work into current `main`
Not sure why the Mac CI is failing due to unregistered pytest marks. I can't reproduce it locally. I'll try to look closer tomorrow. GitHub Actions was having issues earlier, so I stopped working on this.
src/skimage/conftest.py (outdated)

```python
def pytest_configure(config):
    if not PARALLEL_RUN_AVAILABLE:
        config.addinivalue_line(
```
Looks like the marker is defined optionally here. So, why is pytest_run_parallel not available on macos?
This is how we set it up in NumPy, FWIW. I didn't want to make pytest_run_parallel a required test dependency for everyone who wants to test scikit-image.
But I think you did identify the issue: scikit-image has two different conftest.py files! I totally missed that.
OK, I think this is ready now. The pyodide failure looks unrelated. @stefanv can you let me know what you think? Also, while we're looking at this, maybe it makes sense to add 3.14 and 3.14t test jobs for the Mac and Windows CI as well? I don't think it makes sense to run pytest-run-parallel on more than one CI job, though.
📝 Walkthrough

Adds thread-safety to internal modules and tests: serialized fetches, locked image-collection caching, thread-local curvature state, test annotations for thread-unsafe tests, deterministic local RNGs, per-thread test output paths, and a conditional CI step to run multithreaded tests with pytest-run-parallel.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
tests/skimage/feature/test_util.py (2)
166-170: ⚠️ Potential issue | 🟡 Minor

Use seeded `default_rng()` for thread-safe parallel execution.

This test uses `np.random.rand()` and `np.random.randint()`, which access global RNG state, whereas `test_plot_matched_features` correctly uses `np.random.default_rng()`. The global RNG is not thread-safe and will cause non-deterministic behavior under pytest-run-parallel.

🔧 Proposed fix

```diff
-    keypoints0 = 10 * np.random.rand(10, 2)
-    keypoints1 = 10 * np.random.rand(10, 2)
-    idxs0 = np.random.randint(10, size=10)
-    idxs1 = np.random.randint(10, size=10)
+    rng = np.random.default_rng(12345)
+    keypoints0 = 10 * rng.random((10, 2))
+    keypoints1 = 10 * rng.random((10, 2))
+    idxs0 = rng.integers(10, size=10)
+    idxs1 = rng.integers(10, size=10)
```
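For illustration, here is a self-contained sketch (seed values arbitrary) of why local `default_rng` instances are safe under parallel test runs while the legacy global functions are not:

```python
import numpy as np

# Two independently seeded local generators produce identical, reproducible
# streams, regardless of what other tests or threads do with np.random.
rng_a = np.random.default_rng(12345)
rng_b = np.random.default_rng(12345)

keypoints_a = 10 * rng_a.random((10, 2))
keypoints_b = 10 * rng_b.random((10, 2))
assert np.array_equal(keypoints_a, keypoints_b)

# The legacy module-level functions all share one hidden generator, so any
# other caller (or thread) advances the same stream between your draws.
np.random.seed(0)
first = np.random.rand()
np.random.rand()          # some other code consumes a draw...
np.random.seed(0)
assert np.random.rand() == first  # ...and reproducibility needs re-seeding
```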
202-206: ⚠️ Potential issue | 🟡 Minor

Use seeded `default_rng()` for thread-safe parallel execution. Same issue as above: global RNG functions are not thread-safe.

🔧 Proposed fix

```diff
-    keypoints0 = 10 * np.random.rand(10, 2)
-    keypoints1 = 10 * np.random.rand(10, 2)
-    idxs0 = np.random.randint(10, size=10)
-    idxs1 = np.random.randint(10, size=10)
+    rng = np.random.default_rng(67890)
+    keypoints0 = 10 * rng.random((10, 2))
+    keypoints1 = 10 * rng.random((10, 2))
+    idxs0 = rng.integers(10, size=10)
+    idxs1 = rng.integers(10, size=10)
```

src/skimage/_shared/testing.py (1)
197-223: ⚠️ Potential issue | 🟠 Major

Move the fetch serialization to `skimage.data._fetchers`, not just this helper.

`fetch()` is only one entry point. `src/skimage/data/_fetchers.py::_load()` (used by `skimage.data.astronaut()` and similar), `download_all()`, `lbp_frontal_face_cascade_filename()`, and `lfw_subset()` still call `_fetch()` directly, so parallel tests that go through `skimage.data.*` bypass `_FETCH_LOCK` entirely. That leaves the same download/cache race in place despite this change.
🧹 Nitpick comments (2)

.github/workflows/test-linux.yaml (1)

131-135: Consider whether `--doctest-plus` should be included for consistency.

The regular test step (line 129) includes `--doctest-plus`, but this multithreaded step omits it. If doctests are intentionally excluded due to thread-safety concerns, adding a comment would clarify the rationale. Otherwise, this might be an oversight.

tests/skimage/conftest.py (1)

10-15: Consider using `ImportError` instead of broad `Exception`.

Catching `ImportError` (or `ModuleNotFoundError`) is more specific for import failures and addresses the static analysis hint. However, the current approach is acceptable if there are edge cases where other exceptions could occur during plugin initialization.

Suggested refinement

```diff
 try:
     import pytest_run_parallel  # noqa:F401
     PARALLEL_RUN_AVAILABLE = True
-except Exception:
+except ImportError:
     PARALLEL_RUN_AVAILABLE = False
```
ℹ️ Review info: configuration path .coderabbit.yaml · review profile CHILL · plan Pro · run ID d12e215a-dcc7-4749-b7cf-e96e59c7a892
📒 Files selected for processing (19)

.github/workflows/test-linux.yaml · src/skimage/_shared/testing.py · src/skimage/io/collection.py · src/skimage/segmentation/morphsnakes.py · tests/skimage/_shared/test_testing.py · tests/skimage/conftest.py · tests/skimage/data/test_data.py · tests/skimage/feature/test_corner.py · tests/skimage/feature/test_haar.py · tests/skimage/feature/test_util.py · tests/skimage/filters/rank/test_rank.py · tests/skimage/io/test_imageio.py · tests/skimage/io/test_plugin.py · tests/skimage/segmentation/test_boundaries.py · tests/skimage/segmentation/test_random_walker.py · tests/skimage/transform/test_radon_transform.py · tests/skimage/transform/test_thin_plate_splines.py · tests/skimage/util/test_dtype.py · tests/skimage2/test_skimage2.py
💤 Files with no reviewable changes (1)
- tests/skimage/transform/test_thin_plate_splines.py
src/skimage/io/collection.py (excerpt)

```python
    fname, img_num = self._frame_index[n]
    if img_num is not None:
        kwargs['img_num'] = img_num
    try:
        self.data[idx] = self.load_func(fname, **kwargs)
    # Account for functions that do not accept an img_num kwarg
    except TypeError as e:
        if "unexpected keyword argument 'img_num'" in str(e):
            del kwargs['img_num']
            self.data[idx] = self.load_func(fname, **kwargs)
        else:
            raise
else:
    self.data[idx] = self.load_func(self.files[n], **kwargs)
self._cached = n

return self.data[idx]
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail
echo "ImageCollection multi-frame path:"
rg -n -C3 "_frame_index|img_num|unexpected keyword argument 'img_num'" src/skimage/io/collection.py
echo
echo "Plugin imread definitions and img_num handling:"
fd '_plugin\.py$' src/skimage/io/_plugins -x sh -c '
  echo "== $1 =="
  rg -n -C2 "def imread\(" "$1"
  rg -n -C2 "\bimg_num\b" "$1" || true
' sh {}
```

Repository: scikit-image/scikit-image
Length of output: 4445
Don't fall back to loading without img_num on the multi-frame path.
We only enter this block after n has already been resolved through self._frame_index, i.e. a specific frame inside a multi-frame file is being requested. Retrying without img_num silently changes the semantics for n > 0 and can return the backend's default frame instead of the requested one.
Several shipped backends do not accept img_num (simpleitk, fits, imread_plugin, gdal), so this fallback will always trigger for them on multi-frame paths, causing silent data loss when accessing frames other than the first.
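A minimal, self-contained repro of the flagged semantics (backend and loader names are hypothetical stand-ins, not the real plugin code):

```python
def backend_without_img_num(fname):
    # stands in for a plugin (e.g. a fits/gdal-style backend) that has no
    # img_num kwarg and can only return its default (first) frame
    return f"{fname}:frame0"

def load_frame(load_func, fname, img_num):
    kwargs = {'img_num': img_num}
    try:
        return load_func(fname, **kwargs)
    except TypeError as e:
        if "unexpected keyword argument 'img_num'" in str(e):
            del kwargs['img_num']
            # the retry silently drops the frame request
            return load_func(fname, **kwargs)
        raise

result = load_frame(backend_without_img_num, "movie.tif", img_num=3)
assert result == "movie.tif:frame0"  # frame 3 was requested; frame 0 returned
```

No error is raised, which is exactly the silent data loss described above.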
This is commenting on pre-existing issues, so I'm ignoring it.
src/skimage/segmentation/morphsnakes.py

```diff
-_curvop = _fcycle(
-    [lambda u: sup_inf(inf_sup(u)), lambda u: inf_sup(sup_inf(u))]  # SIoIS
-)  # ISoSI
+_curvop = threading.local()
```
Reset the curvature operator on each API call.
threading.local() only isolates threads. It does not reset the _fcycle between calls on the same worker thread, so the next invocation starts on the opposite operator whenever the previous call consumed an odd number of smoothing steps. That makes results call-order-dependent.
Suggested fix
-_curvop = threading.local()
+_curvop = threading.local()
@@
- if not hasattr(_curvop, 'c'):
- _curvop.c = _fcycle(
- [
- lambda u: sup_inf(inf_sup(u)), # SIoIS
- lambda u: inf_sup(sup_inf(u)),
- ]
- ) # ISoSI
+ curvop = _fcycle(
+ [
+ lambda u: sup_inf(inf_sup(u)), # SIoIS
+ lambda u: inf_sup(sup_inf(u)),
+ ]
+ ) # ISoSI
@@
- u = _curvop.c(u)
+ u = curvop(u)
@@
- if not hasattr(_curvop, 'c'):
- _curvop.c = _fcycle(
- [
- lambda u: sup_inf(inf_sup(u)), # SIoIS
- lambda u: inf_sup(sup_inf(u)),
- ]
- ) # ISoSI
+ curvop = _fcycle(
+ [
+ lambda u: sup_inf(inf_sup(u)), # SIoIS
+ lambda u: inf_sup(sup_inf(u)),
+ ]
+ ) # ISoSI
@@
- u = _curvop.c(u)
+ u = curvop(u)

Also applies to: 290-296, 321-322, 414-420, 458-459
The suggested diff is nonsense. While it's correct that making this state be thread-local doesn't fix the fact that API calls won't be independent, that's also the pre-existing single-threaded behavior.
A better way to fix this is probably to store the _fcycle state in a context variable and use with statements in the implementation to manage the context state. One way to think of context variables is that they're "souped up" thread locals that are async-local as well as thread-local and that work syntactically with the Python with statement.
I could do that refactor, but I'd rather not because the preexisting behavior is probably fine for most uses.
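A rough sketch of that contextvar approach (names hypothetical, and the curvature operators replaced by trivial lambdas so it runs standalone):

```python
import contextlib
import contextvars
import itertools

# Each API call would enter a `with` block that installs a fresh operator
# cycle; contextvars isolates the state per thread *and* per async task.
_curvop_state = contextvars.ContextVar("curvop_state")

@contextlib.contextmanager
def fresh_curvop(ops):
    token = _curvop_state.set(itertools.cycle(ops))
    try:
        yield
    finally:
        _curvop_state.reset(token)

def apply_curvop(u):
    # pull the next operator from the current call's cycle
    return next(_curvop_state.get())(u)

# Each `with` block restarts the SIoIS/ISoSI alternation, so results no
# longer depend on how many smoothing steps a previous call consumed.
with fresh_curvop([lambda u: u + 1, lambda u: u - 1]):
    assert apply_curvop(0) == 1   # first operator
    assert apply_curvop(0) == -1  # second operator
with fresh_curvop([lambda u: u + 1, lambda u: u - 1]):
    assert apply_curvop(0) == 1   # alternation restarts per call
```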
tests/skimage/conftest.py (outdated)

```python
def pytest_configure(config):
    if not PARALLEL_RUN_AVAILABLE:
        config.addinivalue_line(
            "markers",
            "thread_unsafe: mark the test function as single-threaded",
        )
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Verify whether tests/skimage2/ has its own conftest.py that registers the marker
fd -t f 'conftest.py' tests/
# Check if thread_unsafe marker is registered elsewhere
rg -n "thread_unsafe" pyproject.toml tests/conftest.py 2>/dev/null || echo "Not found in pyproject.toml or tests/conftest.py"
```

Repository: scikit-image/scikit-image
Length of output: 161
🏁 Script executed:

```bash
# Check if tests/skimage2/ exists and what files are in it
fd -t f . tests/skimage2/ 2>/dev/null || echo "tests/skimage2/ does not exist or is empty"
# Search for thread_unsafe marker usage across all test files
rg "thread_unsafe" tests/ -t py
# Verify --strict-markers in pyproject.toml
rg "strict-markers" pyproject.toml
```

Repository: scikit-image/scikit-image
Length of output: 1659
🏁 Script executed:

```bash
# Confirm pytest's strict-markers setting in pyproject.toml
rg -A 2 -B 2 "strict-markers" pyproject.toml
# Also verify if tests/conftest.py might register marker differently
head -50 tests/conftest.py | grep -i "thread_unsafe\|pytest_configure"
```

Repository: scikit-image/scikit-image
Length of output: 216
Marker registration does not cover tests/skimage2/ tests.
The thread_unsafe marker is registered only in tests/skimage/conftest.py, which applies only to tests in the tests/skimage/ subtree. However, tests/skimage2/test_skimage2.py uses @pytest.mark.thread_unsafe, and since tests/skimage2/ is a sibling directory (not a subdirectory of tests/skimage/), it does not inherit this marker registration.
With --strict-markers enabled in pyproject.toml, this will cause pytest to report an unknown marker error when running tests in tests/skimage2/.
Move this marker registration to tests/conftest.py (root-level conftest) so it applies to all test subdirectories.
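A sketch of the suggested root-level registration, assuming the same optional-import pattern already used in `tests/skimage/conftest.py`:

```python
# tests/conftest.py (root level): sketch of the fix described above
try:
    import pytest_run_parallel  # noqa: F401
    PARALLEL_RUN_AVAILABLE = True
except ImportError:
    PARALLEL_RUN_AVAILABLE = False

def pytest_configure(config):
    if not PARALLEL_RUN_AVAILABLE:
        # Registering here covers every subtree under tests/, including
        # tests/skimage2/, so --strict-markers no longer errors there.
        config.addinivalue_line(
            "markers",
            "thread_unsafe: mark the test function as single-threaded",
        )
```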
This is a real issue.
```python
@pytest.mark.thread_unsafe("Tests monkeypatch global plugins state")
@pytest.mark.skipif(not has_mpl, reason="matplotlib not installed")
```
Mark the whole module thread-unsafe.
The autouse _use_pil_plugin() fixture mutates global plugin state before and after every test in this file. test_failed_use() and the parametrized test_deprecation_warnings_on_plugin_funcs() still run under that fixture but remain unmarked, so this module can still race under pytest-run-parallel. A module-level thread_unsafe marker is safer than tagging only three tests.
Suggested adjustment

```diff
+pytestmark = pytest.mark.thread_unsafe(
+    reason="module mutates global plugin state via _use_pil_plugin"
+)
+
 @pytest.fixture(autouse=True)
 def _use_pil_plugin():
     io.use_plugin('pil')
     yield
     io.reset_plugins()
```

Also applies to: 48-49, 64-65
I disagree that a module-level mark is better, since that means the other tests in this file which are currently being run won't be run.
```diff
 def make_2d_syntheticdata(lx, ly=None, rng=None):
     if ly is None:
         ly = lx
-    np.random.seed(1234)
-    data = np.zeros((lx, ly)) + 0.1 * np.random.randn(lx, ly)
+    if rng is None:
+        rng = np.random.RandomState(1234)
+    data = np.zeros((lx, ly)) + 0.1 * rng.randn(lx, ly)
     small_l = int(lx // 5)
     data[
         lx // 2 - small_l : lx // 2 + small_l, ly // 2 - small_l : ly // 2 + small_l
     ] = 1
     data[
         lx // 2 - small_l + 1 : lx // 2 + small_l - 1,
         ly // 2 - small_l + 1 : ly // 2 + small_l - 1,
-    ] = 0.1 * np.random.randn(2 * small_l - 2, 2 * small_l - 2)
+    ] = 0.1 * rng.randn(2 * small_l - 2, 2 * small_l - 2)
```
Add a regression test for the new rng hook.
These helpers now expose behavior that is central to the parallelization work, but the changed tests only cover the default seeded path. Please add one direct test that passes explicit RNG instances into both helpers and asserts identical output for identically seeded RNGs, so a future refactor cannot silently ignore the injected RNG.
Suggested test shape

```diff
+@testing.parametrize(
+    "make_data, shape",
+    [
+        (make_2d_syntheticdata, (70, 100)),
+        (make_3d_syntheticdata, (30, 30, 30)),
+    ],
+)
+def test_syntheticdata_respects_explicit_rng(make_data, shape):
+    rng1 = np.random.RandomState(0)
+    rng2 = np.random.RandomState(0)
+
+    data1, seeds1 = make_data(*shape, rng=rng1)
+    data2, seeds2 = make_data(*shape, rng=rng2)
+
+    np.testing.assert_array_equal(data1, data2)
+    np.testing.assert_array_equal(seeds1, seeds2)
```

As per coding guidelines, `**/*.py`: all code should have tests covering all modifications in behavior.
Also applies to: 32-39
@stefanv or another reviewer, please let me know if you want me to do this
Actionable comments posted: 1

🧹 Nitpick comments (1)

tests/conftest.py (1)

83-88: Narrow the exception type to `ImportError`.

Catching broad `Exception` can mask unexpected errors during import. Since you're specifically checking for module availability, use `ImportError` (or `ModuleNotFoundError`).

♻️ Proposed fix

```diff
 try:
     import pytest_run_parallel  # noqa:F401
     PARALLEL_RUN_AVAILABLE = True
-except Exception:
+except ImportError:
     PARALLEL_RUN_AVAILABLE = False
```
📒 Files selected for processing (1)
tests/conftest.py
🧹 Nitpick comments (1)

.github/workflows/test-linux.yaml (1)

131-136: Consider pinning the `pytest-run-parallel` version and adding the `EAGER_IMPORT` env var for consistency.

Unpinned dependency: installing `pytest-run-parallel` without a version pin could cause unexpected CI failures on breaking releases. Currently at version 0.8.2.

Missing `EAGER_IMPORT` env var: the "Run tests" step explicitly sets `EAGER_IMPORT: ${{ matrix.EAGER_IMPORT }}` (line 127). While the multithreaded step inherits the matrix default value of 0, adding it here maintains consistency across test steps.

Suggested improvement

```diff
 - name: Run multithreaded tests (3.14t only)
   if: ${{ matrix.python-version == '3.14t' }}
+  env:
+    EAGER_IMPORT: ${{ matrix.EAGER_IMPORT }}
   run: |
-    pip install pytest-run-parallel
+    pip install "pytest-run-parallel>=0.8"
     $PYTEST --doctest-plus --showlocals --pyargs skimage --parallel-threads=4 ./tests
```
📒 Files selected for processing (2)
.github/workflows/test-linux.yaml · tests/conftest.py
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/conftest.py
It looks like using the numpy global RNG in tests might cause some flakiness if this PR gets merged as-is. Maybe I should just globally replace it with a local per-test RandomState every time the global RNG is used in scikit-image tests? That'd probably have to be its own PR; I'd be touching a lot of tests.
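For example, the mechanical replacement could look like this sketch (helper name hypothetical; in the real suite it would likely be a pytest fixture):

```python
import numpy as np

# Hypothetical sketch: instead of seeding the shared global RNG with
# np.random.seed(...), each test creates its own RandomState, so tests
# running concurrently cannot interfere with each other's draws.
def local_rng(seed=1234):
    """Per-test RNG; in a pytest suite this would be a fixture."""
    return np.random.RandomState(seed)

def test_noise_is_deterministic():
    rng = local_rng()
    noise = rng.randn(4, 4)
    # identical seed -> identical stream, independent of other tests/threads
    assert np.array_equal(noise, local_rng().randn(4, 4))

test_noise_is_deterministic()
```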
🧹 Nitpick comments (1)

tests/skimage/metrics/test_structural_similarity.py (1)

148-158: Rename unused loop variable `ndim` to `_ndim`.

The loop variable `ndim` is not used within the loop body. Prefix it with an underscore to indicate it's intentionally unused.

♻️ Proposed fix

```diff
     rng = np.random.RandomState(42)
-    for ndim in range(1, 5):
+    for _ndim in range(1, 5):
         xsize = [
             N,
         ] * 5
```
📒 Files selected for processing (1)
tests/skimage/metrics/test_structural_similarity.py
```diff
 cam = data.camera()
 sigma = 20.0
-cam_noisy = np.clip(cam + sigma * np.random.randn(*cam.shape), 0, 255)
+rng = np.random.RandomState(5)
```
Why not np.random.default_rng(5)?
Because `default_rng` isn't subject to the strict backward compatibility guarantees that `RandomState` has. See numpy/numpy#29729 for where we did a similar refactor in NumPy. Robert Kern wrote the new RNG infrastructure, so I generally defer to his advice on this stuff.
In an ideal world, we wouldn't even set any seeds in our test suite; but we do it to avoid running into edge cases. I think any seed that works is fine (and if you find one that isn't, then we want to know about that too!).
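The compatibility distinction can be seen in a small sketch: under NumPy's RNG policy (NEP 19), `RandomState`'s bit stream is frozen forever, while `Generator` streams may change between NumPy feature releases.

```python
import numpy as np

# RandomState's bit stream is frozen by NumPy's compatibility policy, so
# hard-coded expected values in tests remain valid across NumPy upgrades.
rs = np.random.RandomState(5)
vals = rs.randint(0, 10, size=3)
assert np.array_equal(np.random.RandomState(5).randint(0, 10, size=3), vals)

# default_rng / Generator is the modern API; identically seeded generators
# agree with each other within one NumPy version, but the policy allows
# their streams to change in a future release, which is why frozen test
# expectations prefer RandomState.
g1 = np.random.default_rng(5)
g2 = np.random.default_rng(5)
assert np.array_equal(g1.integers(0, 10, size=3), g2.integers(0, 10, size=3))
```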
A good idea to use local RNG states, as that would also presumably help pave the way towards a parallelized test suite, using pytest-xdist or similar.
Description
Supersedes #7678.
Updates the test suite to be runnable under pytest-run-parallel.
Checklist
- `./doc/examples` for new features

Release note
For maintainers and optionally contributors, please refer to the instructions on how to document this PR for the release notes.