docs: output files within output directories is an error (#2848)#2913
docs: output files within output directories is an error (#2848)#2913johanneskoester merged 10 commits intosnakemake:mainfrom
Conversation
|
|
There is |
|
|
|
I see. So let's be happy with the kind of test we have for now, and happy to take a separate PR that refines the testing in the future. |
|
Mhm, sorry, did I break something with my little change? |
📝 WalkthroughWalkthroughAdds two documentation sidebars warning about directory outputs, a new Snakefile reproducing issue 2848, an expected-fail test targeting that workflow, and a change in check_directory_outputs to use pathlib.Path.absolute() instead of os.path.abspath(). Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 💡 Knowledge Base configuration:
You can enable these sources in your CodeRabbit configuration. 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (55)
✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
docs/snakefiles/rules.rst(1 hunks)snakemake/dag.py(1 hunks)tests/test_github_issue2848/Snakefile(1 hunks)tests/tests.py(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- snakemake/dag.py
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
tests/tests.py
🪛 GitHub Actions: docs
docs/snakefiles/rules.rst
[error] 1776-1776: Content block expected for the 'sidebar' directive; none found.
🔇 Additional comments (5)
tests/test_github_issue2848/Snakefile (4)
1-1: LGTM!Setting the shell executable to "bash" is appropriate for the test case.
3-6: LGTM!Rule
acorrectly creates the initial file that will be used as input for ruleb.
15-20: LGTM!Rules
canddare correctly defined to complete the test case. They are included in the test targets but won't be reached due to the expected parsing failure.Also applies to: 21-24
7-14: Verify that this rule triggers the expected error.Rule
bis designed to fail during parsing because it creates a directory that would contain its own input file. This is the core of the test case, verifying that such invalid configurations are caught early.tests/tests.py (1)
2330-2336: LGTM!The test function is well-designed to verify the fix:
- The comment clearly documents the expected behavior (fail during parsing).
- The test targets are appropriately set to trigger the invalid configuration.
- The
shouldfail=Trueparameter correctly indicates that the test should fail.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
|
|
@NelisDrost can you fix the conflicts? Just so I can merge and close this PR. |
|
Done that now! Thanks a lot everybody! |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/snakemake/dag.py (1)
299-312: Prefer pathlib-relative check and skip non-local outputsAvoid string conversions and special-casing with os.path.commonpath; use Path.relative_to() and ignore storage URLs to keep this purely a local-FS concern.
Apply this diff:
- """Check that no output file is contained in a directory output of the same or another rule.""" + """Check that no output file is contained in a directory output of another rule.""" outputs = sorted( - {(Path(f).absolute(), job) for job in self.jobs for f in job.output} + { + (Path(os.fspath(f)).absolute(), job) + for job in self.jobs + for f in job.output + if is_local_file(f) + } ) - for i in range(len(outputs) - 1): - (a, job_a), (b, job_b) = outputs[i : i + 2] - try: - common = os.path.commonpath([a, b]) - except ValueError: - # commonpath raises error if windows drives are different. - continue - if a != b and common == os.path.commonpath([a]) and job_a != job_b: - raise ChildIOException(parent=outputs[i], child=outputs[i + 1]) + for (a, job_a), (b, job_b) in zip(outputs, outputs[1:]): + try: + b.relative_to(a) # b is nested under a (handles drives/separators) + is_child = a != b + except ValueError: + is_child = False + if is_child and job_a != job_b: + raise ChildIOException(parent=(a, job_a), child=(b, job_b))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
docs/snakefiles/rules.rst(1 hunks)src/snakemake/dag.py(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- docs/snakefiles/rules.rst
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
src/snakemake/dag.py
🧬 Code graph analysis (1)
src/snakemake/dag.py (1)
src/snakemake/jobs.py (5)
jobs(1367-1368)jobs(1371-1372)output(345-346)output(349-350)output(1546-1552)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (56)
- GitHub Check: tests (10, windows-2022, py313)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (10, windows-2022, py312)
- GitHub Check: tests (7, windows-2022, py312)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (6, windows-2022, py313)
- GitHub Check: tests (9, ubuntu-latest, py313)
- GitHub Check: tests (9, ubuntu-latest, py312)
- GitHub Check: tests (8, windows-2022, py312)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (8, windows-2022, py313)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (9, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (9, windows-2022, py312)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (5, windows-2022, py312)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (6, windows-2022, py312)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (4, windows-2022, py312)
- GitHub Check: tests (5, ubuntu-latest, py313)
- GitHub Check: tests (4, macos-latest, py312)
- GitHub Check: tests (3, windows-2022, py312)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (3, macos-latest, py312)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py313)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (3, macos-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (2, windows-2022, py313)
- GitHub Check: tests (2, windows-2022, py312)
- GitHub Check: tests (1, windows-2022, py312)
- GitHub Check: tests (2, macos-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (1, ubuntu-latest, py311)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: tests (1, ubuntu-latest, py313)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (1, macos-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py312)
- GitHub Check: apidocs
🔇 Additional comments (1)
src/snakemake/dag.py (1)
300-303: Fix confirmed: Path-based sort resolves Windows ordering bug.Switching to pathlib.Path for sorting avoids the backslash ASCII-order pitfall on Windows. Good change.
🤖 I have created a release *beep* *boop* --- ## [9.11.0](v9.10.1...v9.11.0) (2025-09-05) ### Features * add landing page to the report containing custom metadata defined with a YTE yaml template ([#3452](#3452)) ([c754d80](c754d80)) * Issue a warning when unsupported characters are used in wildcard names ([#1587](#1587)) ([fa57355](fa57355)) ### Bug Fixes * avoid checking output files in immediate-submit mode ([#3569](#3569)) ([58c42cf](58c42cf)) * clarify --keep-going flag help text to distinguish runtime vs DAG construction errors ([#3705](#3705)) ([a3a8ef4](a3a8ef4)) * enable docstring assignment in `use rule ... with:` block ([#3710](#3710)) ([2191962](2191962)) * fix missing attribute error in greedy scheduler settings when using touch, dryrun or immediate submit. ([6471004](6471004)) * only trigger script with CODE label ([#3707](#3707)) ([2d01f8c](2d01f8c)) * parser.py incorrectly parsing multiline f-strings ([#3701](#3701)) ([06ece76](06ece76)) * parsing ordinary yaml strings ([#3506](#3506)) ([ddf334e](ddf334e)) * remove temp files when using checkpoints ([#3716](#3716)) ([5ff3e20](5ff3e20)) * Restore python rules changes triggering reruns. (GH: [#3213](#3213)) ([#3485](#3485)) ([2f663be](2f663be)) * unit testing ([#3680](#3680)) ([06ba7e6](06ba7e6)) * use queue_input_wait_time when updating queue input jobs ([#3712](#3712)) ([a945a19](a945a19)) ### Documentation * output files within output directories is an error ([#2848](#2848)) ([#2913](#2913)) ([37272e5](37272e5)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
) (snakemake#2913) <!--Add a description of your PR here--> Closes snakemake#2848, by providing the documentation suggested, and fixes a bug found while investigating: The `check_directory_outputs()` function relies on the sorting of file paths to detect nested outputs. This breaks on windows as file paths are separated with `\\` rather than `/`, which is sorted after alphanumeric characters, rather than before. I've fixed this, and provided a test case, but as there's not currently a mechanism for distinguishing failures before/during execution (before being the intended behavior, as this provides better feedback to the user), it would pass either way. A better test would be the code below, but I don't want to stray from the use of `run()` in the other test cases. It may also be possible to extend run with an `expected_exception` parameter for non-shell execution. ``` def test_github_issue2848(): # Should fail on parsing with ChildIOException, on Windows previously failed during execution with api.SnakemakeApi() as snakemake_api: workflow_api = snakemake_api.workflow( resource_settings=settings.ResourceSettings(cores=1), snakefile=Path(dpath("test_github_issue2848/Snakefile")), workdir=Path(dpath("test_github_issue2848")), ) dag_api = workflow_api.dag(dag_settings=settings.DAGSettings(targets={"output/c.txt", "output/dir2/d.txt"})) with pytest.raises(ChildIOException): dag_api.execute_workflow() ``` ### QC <!-- Make sure that you can tick the boxes below. --> * [x] The PR contains a test case for the changes or the changes are already covered by an existing test case. * [x] The documentation (`docs/`) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake). <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Documentation • Updated the docs with a sidebar note explaining that directories marked as outputs are removed before a job runs, helping prevent accidental file deletion. - New Features • Added an example workflow that demonstrates file and directory management in job execution. - Tests • Expanded test coverage to ensure improved error handling around workflow output parsing. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Johannes Köster <johannes.koester@uni-due.de> Co-authored-by: Johannes Köster <johannes.koester@tu-dortmund.de> Co-authored-by: Filipe G. Vieira <1151762+fgvieira@users.noreply.github.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
🤖 I have created a release *beep* *boop* --- ## [9.11.0](snakemake/snakemake@v9.10.1...v9.11.0) (2025-09-05) ### Features * add landing page to the report containing custom metadata defined with a YTE yaml template ([snakemake#3452](snakemake#3452)) ([c754d80](snakemake@c754d80)) * Issue a warning when unsupported characters are used in wildcard names ([snakemake#1587](snakemake#1587)) ([fa57355](snakemake@fa57355)) ### Bug Fixes * avoid checking output files in immediate-submit mode ([snakemake#3569](snakemake#3569)) ([58c42cf](snakemake@58c42cf)) * clarify --keep-going flag help text to distinguish runtime vs DAG construction errors ([snakemake#3705](snakemake#3705)) ([a3a8ef4](snakemake@a3a8ef4)) * enable docstring assignment in `use rule ... with:` block ([snakemake#3710](snakemake#3710)) ([2191962](snakemake@2191962)) * fix missing attribute error in greedy scheduler settings when using touch, dryrun or immediate submit. ([6471004](snakemake@6471004)) * only trigger script with CODE label ([snakemake#3707](snakemake#3707)) ([2d01f8c](snakemake@2d01f8c)) * parser.py incorrectly parsing multiline f-strings ([snakemake#3701](snakemake#3701)) ([06ece76](snakemake@06ece76)) * parsing ordinary yaml strings ([snakemake#3506](snakemake#3506)) ([ddf334e](snakemake@ddf334e)) * remove temp files when using checkpoints ([snakemake#3716](snakemake#3716)) ([5ff3e20](snakemake@5ff3e20)) * Restore python rules changes triggering reruns. (GH: [snakemake#3213](snakemake#3213)) ([snakemake#3485](snakemake#3485)) ([2f663be](snakemake@2f663be)) * unit testing ([snakemake#3680](snakemake#3680)) ([06ba7e6](snakemake@06ba7e6)) * use queue_input_wait_time when updating queue input jobs ([snakemake#3712](snakemake#3712)) ([a945a19](snakemake@a945a19)) ### Documentation * output files within output directories is an error ([snakemake#2848](snakemake#2848)) ([snakemake#2913](snakemake#2913)) ([37272e5](snakemake@37272e5)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).




Closes #2848, by providing the documentation suggested, and fixes a bug found while investigating:
The
check_directory_outputs()function relies on the sorting of file paths to detect nested outputs. This breaks on windows as file paths are separated with\\rather than/, which is sorted after alphanumeric characters, rather than before.I've fixed this, and provided a test case, but as there's not currently a mechanism for distinguishing failures before/during execution (before being the intended behavior, as this provides better feedback to the user), it would pass either way. A better test would be the code below, but I don't want to stray from the use of
run()in the other test cases. It may also be possible to extend run with anexpected_exceptionparameter for non-shell execution.QC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit