docs: Correct workflow.source_path() description in documentation#4036
docs: Correct workflow.source_path() description in documentation#4036johanneskoester merged 6 commits intosnakemake:mainfrom
Conversation
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughDocs updated to instruct using Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/snakefiles/deployment.rst`:
- Around line 464-465: The note incorrectly implies any file in params causes
reruns; change the explanation to state that only using workflow.source_path
within params produces a cached path that changes on each start and thus
triggers reruns. Update the text to explicitly mention workflow.source_path as
the cause, advise using workflow.source_path in input rather than params, and
keep references to params, workflow.source_path, and input so reviewers can find
and update the sentence accordingly.
In `@tests/tests.py`:
- Around line 1997-2012: The test is flaky due to filesystem timestamp
resolution: after calling run(dpath("test_source_path_rerun"), cleanup=False)
then rerunning into tmpdir the mtime check between mtime_param_1 and
mtime_param_2 can be identical; to fix, import time and insert a short sleep
(e.g., time.sleep(1)) after the first run (before computing mtime_param_1 or
before the second run) or, alternatively, use higher-resolution timestamps via
os.stat(...).st_mtime_ns when computing mtime_param_1/mtime_param_2; also ensure
the temporary workspace(s) created when cleanup=False (path1 and path2 returned
by run) are removed in a finally block (use shutil.rmtree on path1/path2) so the
tempdirs are cleaned up even on assertion failure—look for run, dpath,
path1/path2, mtime_param_1/mtime_param_2 to apply these changes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 542026fa-58e8-4cfc-9bda-f8017b7b486c
⛔ Files ignored due to path filters (1)
tests/test_source_path_parent/expected-results/test.outis excluded by!**/*.out
📒 Files selected for processing (4)
docs/snakefiles/deployment.rstdocs/snakefiles/rules.rsttests/test_source_path_parent/resources/test.txttests/tests.py
tests/tests.py
Outdated
| path1 = run(dpath("test_source_path_rerun"), cleanup=False) | ||
|
|
||
| mtime_param_1 = os.path.getmtime(os.path.join(path1, "test.out")) | ||
|
|
||
| mtime_input_1 = os.path.getmtime(os.path.join(path1, "test2.out")) | ||
|
|
||
| path2 = run(dpath("test_source_path_rerun"), tmpdir=path1, cleanup=False) | ||
|
|
||
| mtime_param_2 = os.path.getmtime(os.path.join(path2, "test.out")) | ||
|
|
||
| mtime_input_2 = os.path.getmtime(os.path.join(path2, "test2.out")) | ||
|
|
||
| # rerun of the same workflow with source_path as param results in a rerun | ||
| assert mtime_param_1 != mtime_param_2 | ||
| # rerun of the same workflow with source_path as input results in no rerun | ||
| assert mtime_input_1 == mtime_input_2 |
There was a problem hiding this comment.
Make this rerun assertion robust across filesystem timestamp resolution.
os.path.getmtime() can return the same value for two back-to-back runs on coarse-resolution filesystems, so Line 2010 can fail even when test.out was rebuilt. Since this test also keeps the workspace around with cleanup=False, it should clean that tempdir up in a finally block.
🧪 Suggested hardening
Add near the other stdlib imports:
import time def test_source_path_rerun():
path1 = run(dpath("test_source_path_rerun"), cleanup=False)
-
- mtime_param_1 = os.path.getmtime(os.path.join(path1, "test.out"))
-
- mtime_input_1 = os.path.getmtime(os.path.join(path1, "test2.out"))
-
- path2 = run(dpath("test_source_path_rerun"), tmpdir=path1, cleanup=False)
-
- mtime_param_2 = os.path.getmtime(os.path.join(path2, "test.out"))
-
- mtime_input_2 = os.path.getmtime(os.path.join(path2, "test2.out"))
-
- # rerun of the same workflow with source_path as param results in a rerun
- assert mtime_param_1 != mtime_param_2
- # rerun of the same workflow with source_path as input results in no rerun
- assert mtime_input_1 == mtime_input_2
+ try:
+ param_out = os.path.join(path1, "test.out")
+ input_out = os.path.join(path1, "test2.out")
+
+ mtime_param_1 = os.stat(param_out).st_mtime_ns
+ mtime_input_1 = os.stat(input_out).st_mtime_ns
+
+ time.sleep(1.1)
+ run(dpath("test_source_path_rerun"), tmpdir=path1, cleanup=False)
+
+ mtime_param_2 = os.stat(param_out).st_mtime_ns
+ mtime_input_2 = os.stat(input_out).st_mtime_ns
+
+ # rerun of the same workflow with source_path as param results in a rerun
+ assert mtime_param_1 != mtime_param_2
+ # rerun of the same workflow with source_path as input results in no rerun
+ assert mtime_input_1 == mtime_input_2
+ finally:
+ shutil.rmtree(path1, ignore_errors=ON_WINDOWS)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/tests.py` around lines 1997 - 2012, The test is flaky due to filesystem
timestamp resolution: after calling run(dpath("test_source_path_rerun"),
cleanup=False) then rerunning into tmpdir the mtime check between mtime_param_1
and mtime_param_2 can be identical; to fix, import time and insert a short sleep
(e.g., time.sleep(1)) after the first run (before computing mtime_param_1 or
before the second run) or, alternatively, use higher-resolution timestamps via
os.stat(...).st_mtime_ns when computing mtime_param_1/mtime_param_2; also ensure
the temporary workspace(s) created when cleanup=False (path1 and path2 returned
by run) are removed in a finally block (use shutil.rmtree on path1/path2) so the
tempdirs are cleaned up even on assertion failure—look for run, dpath,
path1/path2, mtime_param_1/mtime_param_2 to apply these changes.
3279eaa to
b94d84b
Compare
* Check if paths of parent directories are correctly resolved * Check that no rerun is triggered if workflow.source_path is used in input and that a rerun is triggered if workflow.source_path is used in params.
b94d84b to
47f0852
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/test_source_path_parent/nest/subdir/Snakefile (1)
6-8: Drop the extraechounless the test asserts logs.It looks like a debug artifact and just adds noise to test output; the
catalone already exercises the behavior under test.♻️ Suggested cleanup
shell: - "echo {input.a};" "cat {input.a} > {output}"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_source_path_parent/nest/subdir/Snakefile` around lines 6 - 8, The shell step in the Snakefile contains an extra debug command ("echo {input.a};") that should be removed; update the shell: block in tests/test_source_path_parent/nest/subdir/Snakefile to only run the actual command (keep "cat {input.a} > {output}") so the test output isn’t noisy and the behavior under test is still exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@tests/test_source_path_parent/nest/subdir/Snakefile`:
- Around line 6-8: The shell step in the Snakefile contains an extra debug
command ("echo {input.a};") that should be removed; update the shell: block in
tests/test_source_path_parent/nest/subdir/Snakefile to only run the actual
command (keep "cat {input.a} > {output}") so the test output isn’t noisy and the
behavior under test is still exercised.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1face5f4-685b-4e2e-b280-ec775fa9fc73
⛔ Files ignored due to path filters (2)
tests/test_source_path_parent/expected-results/test.outis excluded by!**/*.outtests/test_source_path_rerun/expected-results/test.outis excluded by!**/*.out
📒 Files selected for processing (7)
docs/snakefiles/deployment.rstdocs/snakefiles/rules.rsttests/test_source_path_parent/nest/subdir/Snakefiletests/test_source_path_parent/resources/test.txttests/test_source_path_rerun/Snakefiletests/test_source_path_rerun/resources/test.txttests/tests.py
✅ Files skipped from review due to trivial changes (1)
- tests/test_source_path_rerun/resources/test.txt
🚧 Files skipped from review as they are similar to previous changes (3)
- docs/snakefiles/deployment.rst
- docs/snakefiles/rules.rst
- tests/tests.py
🤖 I have created a release *beep* *boop* --- ## [9.17.0](v9.16.3...v9.17.0) (2026-03-13) ### Features * Allow storing snakemake metadata in files or databases ([#4012](#4012)) ([dd75f31](dd75f31)) * Allow to specify comparison command per-unit test ([#3956](#3956)) ([b88171c](b88171c)) * job table orderd topological when run is started ([#4018](#4018)) ([75cf506](75cf506)) * lambda functions for priority in rules ([#3253](#3253)) ([d2aa226](d2aa226)) * Make on... directive of modules accessible ([#4050](#4050)) ([e9f2e1c](e9f2e1c)) ### Bug Fixes * adjust conda tests to not fail on apple silicon; fix [#4040](#4040) ([#4049](#4049)) ([f5b0142](f5b0142)) * allow "--containerize apptainer" to output apptainer format instead of dockerfile ([#4030](#4030)) ([f5cac30](f5cac30)) * apptainer command not recognized when singularity is absent ([#4010](#4010)) ([b8162e2](b8162e2)) * capture stderr when tests fail ([#3995](#3995)) ([97d74ba](97d74ba)) * **docs:** make Data-dependent conditional execution a complete example ([#4043](#4043)) ([3a1d7f2](3a1d7f2)) * don't build the DAG when running unlock. Fixes [#4000](#4000) and [#198](#198) ([#4007](#4007)) ([acf79fd](acf79fd)) * Ensure pixi tasks may be run as advertised ([#4046](#4046)) ([88253c2](88253c2)) * fix checkpoint handling corner cases ([#3870](#3870) and [#3559](#3559)) ([#4015](#4015)) ([63f4257](63f4257)) * issue 3642 ([#4054](#4054)) ([76e6fc2](76e6fc2)) * issue 3815 ([#4026](#4026)) ([b0eec96](b0eec96)) * logging None in shellcmd context causes error ([#4064](#4064)) ([d0652cd](d0652cd)) * lookup function returns default value for empty DataFrame queries ([#4056](#4056)) ([f71de97](f71de97)) * make `cache: omit-software` a rule specific property ([#4085](#4085)) ([034a9e7](034a9e7)) * reduce number of tests leaving temporary files behind ([#4033](#4033)) ([a3a1c97](a3a1c97)) * regression in dynamic resource handling ([#4038](#4038)) ([f2c554a](f2c554a)) * somewhat shorter announce message ([#4080](#4080)) ([57efc71](57efc71)) ### Performance Improvements * switch reretry with tenacity; decouple container classes (with Python 3.7 compat for old scripts) from rest of the codebase (enabling moving to newer python versions) ([#4032](#4032)) ([ffb19e7](ffb19e7)) ### Documentation * Add AI-assisted contributions policy to contributing guidelines ([#4051](#4051)) ([dd70526](dd70526)) * **codebase:** Update & simplify plugin architecture section ([#4052](#4052)) ([176cf63](176cf63)) * Correct workflow.source_path() description in documentation ([#4036](#4036)) ([45883c5](45883c5)) * fixed wrong code example for collect() function ([#4037](#4037)) ([5c85ed8](5c85ed8)) * Minor docs improvements ([#4089](#4089)) ([29ea226](29ea226)) * switch to sphinx_design for tabs ([#3976](#3976)) ([9674614](9674614)) * typo in the migration table breaking a pip install command ([#4024](#4024)) ([66f9dda](66f9dda)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
This PR is intended to clarify the workflow.source_path() documentation and additionally ensures the intended functionality with two new tests. It thereby addresses the following issues: #1805, #2331.
In the documentation, the usage of workflow.source_path() is clarified by specifying what that when workflow.source_path() is used in
inputno rerun is triggered while the use inparamscauses a rerun of the given rule upon every start of the pipeline.The newly introduced tests check:
inputbut it does if it is used inparamsQC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit