fix: mtime of script or notebook not triggering workflow without metadata#3148
Conversation
|
Warning Rate limit exceeded@johanneskoester has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 1 minutes and 2 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📝 WalkthroughWalkthroughThe pull request introduces modifications to the Changes
Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- snakemake/dag.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/dag.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
🔇 Additional comments (4)
snakemake/dag.py (4)
1306-1307: Properly handle outdated metadata checkThe added condition correctly checks if the job has outdated metadata using
self.workflow.persistence.has_outdated_metadata(job), and appropriately setsreason.outdated_metadatatoTrue.
1308-1311: Conditionally assignreason.params_changedwithout unnecessary wrappingThe code correctly assigns
reason.params_changedbased on whetherRerunTrigger.PARAMSis inself.workflow.rerun_triggers. Ensure thatself.workflow.persistence.params_changed(job)returns a boolean value.
1317-1327: Efficient handling of async generator inreason.code_changedThe use of a list comprehension to collect items from the async generator
job.outputs_older_than_script_or_notebook()before applyingany()is appropriate. This ensures all items are evaluated, and the code change correctly handles the asynchronous nature of the generator.
1313-1315: 🛠️ Refactor suggestionSimplify boolean assignment for
reason.input_changedIf
self.workflow.persistence.input_changed(job)returns a single boolean value rather than an iterable, wrapping it withany()is unnecessary and may cause confusion. Consider simplifying the code:-if RerunTrigger.INPUT in self.workflow.rerun_triggers: - reason.input_changed = any( - self.workflow.persistence.input_changed(job) - ) +if RerunTrigger.INPUT in self.workflow.rerun_triggers: + reason.input_changed = self.workflow.persistence.input_changed(job)Likely invalid or redundant comment.
|
🤖 I have created a release *beep* *boop* --- ## [8.26.0](v8.25.5...v8.26.0) (2024-12-23) ### Features * add helpers for deferred input/output etc. item access ([#2927](#2927)) ([2cca9bc](2cca9bc)) ### Bug Fixes * convert parameters so they can be serialized ([#2925](#2925)) ([9e653fb](9e653fb)) * correct formatting of R preamble ([#2425](#2425)) ([5380cae](5380cae)) * fix modification checks for scripts and and notebooks containing wildcards or params in their paths ([#2751](#2751)) ([773568d](773568d)) * Improved handling of missing output files in group job postprocessing, accounting for temporary files. ([#1765](#1765)) ([bac06ba](bac06ba)) * mtime of script or notebook not triggering workflow without metadata ([#3148](#3148)) ([e8a0b83](e8a0b83)) * Pass `host` attribute to `GitlabFile` instantiation within class methods ([#3155](#3155)) ([9ef52de](9ef52de)) * problem with spaces in path ([#3236](#3236)) ([2d08c63](2d08c63)) * require current yte release which contains an important bug fix for cases where numpy/pandas data is passed to templates ([#3227](#3227)) ([c3339da](c3339da)) * rerun jobs if previously failed but rule was changed afterwards (thanks to [@laf070810](https://github.com/laf070810) for bringing this up) ([#3237](#3237)) ([1dc0084](1dc0084)) * use relpath for configfiles added to the source archive (thanks to [@sposadac](https://github.com/sposadac) for the initial solution) ([#3240](#3240)) ([bff3844](bff3844)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
This PR is inspired by #3706, resolves issues #3456 and #3474, and addresses the concern raised in #3148 (comment) In brief, commit e502312 made changes so that the modification time (mtime) of script and notebook files no longer triggers a rerun of the workflow, later, commit e8a0b83 reversed this, making mtime changes always trigger reruns -- even when RerunTrigger.CODE is not specified in `--rerun-triggers`. This created an undesirable situation: legitimate and common scenarios (e.g., copying scripts or pulling them from a Git repository) could unnecessarily trigger reruns — as reported in #3474. To make the behavior more intuitive, the `outputs_older_than_script_or_notebook` condition should take precedence over metadata checks. However, this condition should still be governed by the presence of `RerunTrigger.CODE`. ### QC <!-- Make sure that you can tick the boxes below. --> * [x] The PR contains a test case for the changes or the changes are already covered by an existing test case. * [ ] The documentation (`docs/`) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake). <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Bug Fixes** * Code-change reruns now only run when the CODE rerun trigger is enabled; code-change detection correctly accumulates signals from multiple sources to avoid missed or spurious reruns. * **Tests** * Added an MTIME-triggered rerun test validating that outputs retain timestamps when only input mtimes change. * **Chores** * Fixed a test import path and broadened connectivity error handling to include timeouts. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
…data (snakemake#3148) <!--Add a description of your PR here--> See snakemake#3014 (review) ### QC <!-- Make sure that you can tick the boxes below. --> * [ ] The PR contains a test case for the changes or the changes are already covered by an existing test case. * [ ] The documentation (`docs/`) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake). <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Bug Fixes** - Improved logic for determining job execution conditions, ensuring accurate evaluations. - **Refactor** - Streamlined the `update_needrun` method for better readability and maintainability. - Consolidated checks for job metadata and conditions to eliminate redundancy. - **Style** - Minor adjustments to comments and formatting for enhanced clarity. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: Anfeng Li <anfeng.li@cern.ch> Co-authored-by: Johannes Köster <johannes.koester@tu-dortmund.de>
🤖 I have created a release *beep* *boop* --- ## [8.26.0](snakemake/snakemake@v8.25.5...v8.26.0) (2024-12-23) ### Features * add helpers for deferred input/output etc. item access ([snakemake#2927](snakemake#2927)) ([2cca9bc](snakemake@2cca9bc)) ### Bug Fixes * convert parameters so they can be serialized ([snakemake#2925](snakemake#2925)) ([9e653fb](snakemake@9e653fb)) * correct formatting of R preamble ([snakemake#2425](snakemake#2425)) ([5380cae](snakemake@5380cae)) * fix modification checks for scripts and and notebooks containing wildcards or params in their paths ([snakemake#2751](snakemake#2751)) ([773568d](snakemake@773568d)) * Improved handling of missing output files in group job postprocessing, accounting for temporary files. ([snakemake#1765](snakemake#1765)) ([bac06ba](snakemake@bac06ba)) * mtime of script or notebook not triggering workflow without metadata ([snakemake#3148](snakemake#3148)) ([e8a0b83](snakemake@e8a0b83)) * Pass `host` attribute to `GitlabFile` instantiation within class methods ([snakemake#3155](snakemake#3155)) ([9ef52de](snakemake@9ef52de)) * problem with spaces in path ([snakemake#3236](snakemake#3236)) ([2d08c63](snakemake@2d08c63)) * require current yte release which contains an important bug fix for cases where numpy/pandas data is passed to templates ([snakemake#3227](snakemake#3227)) ([c3339da](snakemake@c3339da)) * rerun jobs if previously failed but rule was changed afterwards (thanks to [@laf070810](https://github.com/laf070810) for bringing this up) ([snakemake#3237](snakemake#3237)) ([1dc0084](snakemake@1dc0084)) * use relpath for configfiles added to the source archive (thanks to [@sposadac](https://github.com/sposadac) for the initial solution) ([snakemake#3240](snakemake#3240)) ([bff3844](snakemake@bff3844)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
This PR is inspired by snakemake#3706, resolves issues snakemake#3456 and snakemake#3474, and addresses the concern raised in snakemake#3148 (comment) In brief, commit e502312 made changes so that the modification time (mtime) of script and notebook files no longer triggers a rerun of the workflow, later, commit e8a0b83 reversed this, making mtime changes always trigger reruns -- even when RerunTrigger.CODE is not specified in `--rerun-triggers`. This created an undesirable situation: legitimate and common scenarios (e.g., copying scripts or pulling them from a Git repository) could unnecessarily trigger reruns — as reported in snakemake#3474. To make the behavior more intuitive, the `outputs_older_than_script_or_notebook` condition should take precedence over metadata checks. However, this condition should still be governed by the presence of `RerunTrigger.CODE`. ### QC <!-- Make sure that you can tick the boxes below. --> * [x] The PR contains a test case for the changes or the changes are already covered by an existing test case. * [ ] The documentation (`docs/`) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake). <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Bug Fixes** * Code-change reruns now only run when the CODE rerun trigger is enabled; code-change detection correctly accumulates signals from multiple sources to avoid missed or spurious reruns. * **Tests** * Added an MTIME-triggered rerun test validating that outputs retain timestamps when only input mtimes change. * **Chores** * Fixed a test import path and broadened connectivity error handling to include timeouts. <!-- end of auto-generated comment: release notes by coderabbit.ai -->



See #3014 (review)
QC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit
update_needrunmethod for better readability and maintainability.