fix: consider checkpoint jobs during update_needrun#4092
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughA single conditional in Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip You can customize the high-level summary generated by CodeRabbit.Configure the |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/tests.py`:
- Around line 1632-1648: The test failure is due to a missing test fixture
directory referenced by test_checkpoint_missing_output via
dpath("test_checkpoint_missing_output"); add a new
tests/test_checkpoint_missing_output/ directory to the repo containing the
Snakefile and any input files and expected output files that the test relies on
(ensure the expected output file "output/test_4.txt" is produced by the
Snakefile so the test can remove and verify regeneration), and include any
helper files used by run(...) so the test can execute deterministically.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8e01e412-40bd-4966-8a8e-52f503addfe0
📒 Files selected for processing (2)
src/snakemake/dag.pytests/tests.py
| def test_checkpoint_missing_output(): | ||
| """test for issue 3879""" | ||
| test_dir = dpath("test_checkpoint_missing_output") | ||
| tmpdir = run(test_dir, cleanup=False, check_results=False) | ||
|
|
||
| try: | ||
| missing_file = Path(tmpdir) / "output" / "test_4.txt" | ||
| os.remove(missing_file) | ||
|
|
||
| run(test_dir, tmpdir=tmpdir, cleanup=False, check_results=False) | ||
|
|
||
| assert ( | ||
| missing_file.exists() | ||
| ), "The missing checkpoint output was not regenerated." | ||
| finally: | ||
| shutil.rmtree(tmpdir, ignore_errors=ON_WINDOWS) | ||
|
|
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify the test directory exists with required files
echo "=== Checking test directory ==="
fd -t d "test_checkpoint_missing_output" tests/
echo "=== Listing contents if directory exists ==="
if [ -d "tests/test_checkpoint_missing_output" ]; then
tree tests/test_checkpoint_missing_output/ 2>/dev/null || ls -laR tests/test_checkpoint_missing_output/
else
echo "ERROR: Test directory tests/test_checkpoint_missing_output/ not found"
fiRepository: snakemake/snakemake
Length of output: 209
Add the test directory tests/test_checkpoint_missing_output/ to the repository.
The test calls dpath("test_checkpoint_missing_output"), which requires tests/test_checkpoint_missing_output/ to exist with a Snakefile and expected outputs. This directory is missing from the PR and must be included for the test to run.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/tests.py` around lines 1632 - 1648, The test failure is due to a
missing test fixture directory referenced by test_checkpoint_missing_output via
dpath("test_checkpoint_missing_output"); add a new
tests/test_checkpoint_missing_output/ directory to the repo containing the
Snakefile and any input files and expected output files that the test relies on
(ensure the expected output file "output/test_4.txt" is produced by the
Snakefile so the test can remove and verify regeneration), and include any
helper files used by run(...) so the test can execute deterministically.
There was a problem hiding this comment.
looks like a reasonable solution to me.
edit: i re-read the referenced issue. i am unsure about the solution here now. how can we know if (and which) outputs of a checkpoint are missing? i guess we assume that checkpoints are determinstic each run? in any case im confused
actually upon re-reading the issue i'm unsure if a solution is even possible?
|
After discussion w/ @johanneskoester this may turn out to be problematic when a checkpoint generates temporary files. While this fix would be the only way to notice this, this would also force re-runs for checkpoints with temp outputs, even though technically no changes that'd change the outcome have occurred. |
johanneskoester
left a comment
There was a problem hiding this comment.
First decision after live discussion: we think that it rather should be documented that intermediate jobs between the checkpoint and the input function can not be triggered for reruns if the checkpoint output files are missing. Avoiding that would mean to cause reruns of the checkpoint and everything downstream if the checkpoint output is missing (e.g. because of being marked as temp). Workaround: use -R <checkpoint_name> in such cases.
| if is_forced(job): | ||
| reason.forced = True | ||
| elif job in self.targetjobs: | ||
| elif job in self.targetjobs or job.is_checkpoint: |
There was a problem hiding this comment.
I don't think this is the accurate place to fix this.
Marking every checkpoint as needrun unconditionally here has a broader effect than intended: it forces re-evaluation of checkpoints whose outputs already exist and are up to date, which can cause unnecessary re-runs in incremental workflows.
This is a similar case to #3009, and the fix should be in update_checkpoint_dependencies.
…sts (#4124) will fix #3879. See also the discussion in #4092 ### QC <!-- Make sure that you can tick the boxes below. --> * [x] The PR contains a test case for the changes or the changes are already covered by an existing test case. * [x] This is not necessary to update the documentation (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake). * [x] I, as a human being, have checked each line of code in this pull request ### AI-assistance disclosure <!-- If AI tools were involved in creating this PR, please check all boxes that apply below and make sure that you adhere to our AI-assisted contributions policy: https://github.com/snakemake/snakemake/blob/main/docs/project_info/contributing.rst --> I used AI assistance for: * [x] Code generation (e.g., when writing an implementation or fixing a bug) * [ ] Test/benchmark generation * [ ] Documentation (including examples) * [ ] Research and understanding <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Bug Fixes** * Checkpoint handling now skips instances whose outputs are missing, preventing failures when previously created checkpoint outputs are deleted and allowing workflows to continue. * **Tests** * Added an integration test verifying reruns succeed when checkpoint outputs are removed after creation. * Removed an obsolete checkpoint-related test. <!-- review_stack_entry_start --> [](https://app.coderabbit.ai/change-stack/snakemake/snakemake/pull/4124?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Till Hartmann <till.hartmann@bih-charite.de> Co-authored-by: Johannes Köster <johannes.koester@tu-dortmund.de>
|
Closing in favor of the merged #4124. Nevertheless, thanks for bringing this up and helping to find the correct solution! |
Resolves #3879
Not sure this is the correct solution, though (might have unforeseeable side consequences, though I believe this should be fine, may just cause some more I/O overhead for checking the checkpoint's dependencies).
QC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit
Bug Fixes
Tests