feat: show info on missing metadata#3014
Conversation
WalkthroughThe updates to Snakemake enhance job metadata handling and logging capabilities. A new function, Changes
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 8
Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Files selected for processing (5)
- snakemake/common/init.py (2 hunks)
- snakemake/dag.py (1 hunks)
- snakemake/jobs.py (2 hunks)
- snakemake/persistence.py (1 hunks)
- snakemake/workflow.py (4 hunks)
Additional context used
Ruff
snakemake/common/__init__.py
354-354: Missing type annotation for function argument
jobs(ANN001)
355-355: Unnecessary generator (rewrite as a
setcomprehension)Rewrite as a
setcomprehension(C401)
snakemake/persistence.py
375-375: Missing return type annotation for public function
has_metadata(ANN201)
375-375: Missing type annotation for
selfin method(ANN101)
375-375: Missing type annotation for function argument
job(ANN001)
snakemake/workflow.py
1147-1147: Missing return type annotation for private function
log_missing_metadata_infoAdd return type annotation:
None(ANN202)
1156-1156: Trailing comma missing
Add trailing comma
(COM812)
1247-1247: Trailing comma missing
Add trailing comma
(COM812)
snakemake/dag.py
1266-1266: Trailing comma missing
Add trailing comma
(COM812)
1270-1270: Trailing comma missing
Add trailing comma
(COM812)
1274-1277: Unnecessary list comprehension
Remove unnecessary comprehension
(C419)
1277-1277: Trailing comma missing
Add trailing comma
(COM812)
1279-1279: Trailing comma missing
Add trailing comma
(COM812)
1286-1286: Trailing comma missing
Add trailing comma
(COM812)
1288-1288: Trailing comma missing
Add trailing comma
(COM812)
Additional comments not posted (7)
snakemake/jobs.py (1)
Line range hint
1638-1659: LGTM! The addition ofno_metadatais consistent.The new attribute
no_metadatain theReasonclass is initialized toFalse. This change is consistent with the class's purpose and enhances its functionality.Tools
Ruff
1641-1641: Missing return type annotation for special method
__init__Add return type annotation:
None(ANN204)
1641-1641: Missing type annotation for
selfin method(ANN101)
snakemake/workflow.py (2)
1247-1247: Add a trailing comma in the logger message.The logger message in
log_provenance_infois missing a trailing comma for better formatting.logger.info( "Rules with provenance triggered jobs: " - + " ".join(jobs_to_rulenames(provenance_triggered_jobs)) + + " ".join(jobs_to_rulenames(provenance_triggered_jobs)), )Likely invalid or redundant comment.
Tools
Ruff
1247-1247: Trailing comma missing
Add trailing comma
(COM812)
1218-1219: Integration oflog_missing_metadata_infolooks good.The call to
log_missing_metadata_infois appropriately placed to log missing metadata when no jobs are triggered.Ensure that this function is invoked correctly in all necessary scenarios.
Verification successful
Integration of
log_missing_metadata_infois correct and localized.The function
log_missing_metadata_infois defined and used withinsnakemake/workflow.py, indicating proper integration without external dependencies.
- File:
snakemake/workflow.pyScripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the usage of `log_missing_metadata_info` in the codebase. # Test: Search for the function usage. Expect: Only occurrences of the correct usage. rg --type python $'log_missing_metadata_info'Length of output: 75
Script:
#!/bin/bash # Description: Verify the usage of `log_missing_metadata_info` in the codebase. # Test: Search for the function usage. Expect: Only occurrences of the correct usage. rg --type py $'log_missing_metadata_info'Length of output: 177
Tools
Ruff
1219-1219: Do not implicitly
return Nonein function able to return non-NonevalueAdd explicit
Nonereturn value(RET502)
snakemake/dag.py (4)
1261-1262: Setreason.no_metadataearlier.The
reason.no_metadataflag is set immediately when metadata is absent, which is a good practice to avoid unnecessary checks. This change enhances the control flow and logic of the function.
1269-1270: Add a trailing comma ininput_changed.A trailing comma is missing in the list comprehension for
input_changed. Adding it improves code consistency and readability.- self.workflow.persistence.input_changed(job) + self.workflow.persistence.input_changed(job),Likely invalid or redundant comment.
Tools
Ruff
1270-1270: Trailing comma missing
Add trailing comma
(COM812)
1285-1286: Add a trailing comma inconda_env_changed.A trailing comma is missing in the list comprehension for
conda_env_changed. Adding it improves code consistency and readability.- self.workflow.persistence.conda_env_changed(job) + self.workflow.persistence.conda_env_changed(job),Likely invalid or redundant comment.
Tools
Ruff
1286-1286: Trailing comma missing
Add trailing comma
(COM812)
1279-1279: Add a trailing comma incode_changed.A trailing comma is missing in the list comprehension for
code_changed. Adding it improves code consistency and readability.- self.workflow.persistence.code_changed(job) + self.workflow.persistence.code_changed(job),Likely invalid or redundant comment.
Tools
Ruff
1279-1279: Trailing comma missing
Add trailing comma
(COM812)
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 2
Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Files selected for processing (1)
- snakemake/workflow.py (4 hunks)
Additional context used
Ruff
snakemake/workflow.py
1147-1147: Missing return type annotation for private function
log_missing_metadata_infoAdd return type annotation:
None(ANN202)
1158-1158: Trailing comma missing
Add trailing comma
(COM812)
1249-1249: Trailing comma missing
Add trailing comma
(COM812)
Additional comments not posted (2)
snakemake/workflow.py (2)
1220-1221: Integration oflog_missing_metadata_infoin dry run logic looks good.The addition of
log_missing_metadata_infoduring a dry run enhances the visibility of missing metadata. The integration appears correct.Tools
Ruff
1221-1221: Do not implicitly
return Nonein function able to return non-NonevalueAdd explicit
Nonereturn value(RET502)
1158-1159: Add a trailing comma for better formatting.A trailing comma is missing in the logger message for better formatting and to prevent potential issues when extending the list.
- f"metadata: {' '.join(jobs_to_rulenames(no_metadata_jobs))}" + f"metadata: {' '.join(jobs_to_rulenames(no_metadata_jobs))}",Likely invalid or redundant comment.
Tools
Ruff
1158-1158: Trailing comma missing
Add trailing comma
(COM812)
…akemake/snakemake into fix/notify-of-missing-provenance-info
|
Please format your code with black: |
|
There was a problem hiding this comment.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Files selected for processing (4)
- snakemake/dag.py (1 hunks)
- snakemake/jobs.py (4 hunks)
- snakemake/persistence.py (2 hunks)
- snakemake/workflow.py (4 hunks)
Additional context used
Path-based instructions (4)
snakemake/persistence.py (1)
Pattern
**/*.py: Do not suggest to add trailing commas.
Do not suggest type annotations for functions that are defined inside of functions or methods.snakemake/jobs.py (1)
Pattern
**/*.py: Do not suggest to add trailing commas.
Do not suggest type annotations for functions that are defined inside of functions or methods.snakemake/workflow.py (1)
Pattern
**/*.py: Do not suggest to add trailing commas.
Do not suggest type annotations for functions that are defined inside of functions or methods.snakemake/dag.py (1)
Pattern
**/*.py: Do not suggest to add trailing commas.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Learnings (2)
snakemake/workflow.py (2)
Learnt from: johanneskoester PR: snakemake/snakemake#3014 File: snakemake/workflow.py:1250-1251 Timestamp: 2024-08-13T16:22:58.430Z Learning: Avoid commenting on trailing commas in code reviews for the Snakemake repository as per the user's preference.Learnt from: johanneskoester PR: snakemake/snakemake#3014 File: snakemake/workflow.py:1147-1147 Timestamp: 2024-08-13T16:22:09.641Z Learning: Avoid suggesting type annotations for functions that are inside methods in the Snakemake codebase.snakemake/dag.py (3)
Learnt from: johanneskoester PR: snakemake/snakemake#3014 File: snakemake/dag.py:1265-1266 Timestamp: 2024-08-13T16:09:09.726Z Learning: For the Snakemake project, avoid suggesting trailing commas as they use Black as an autoformatter.Learnt from: johanneskoester PR: snakemake/snakemake#3014 File: snakemake/workflow.py:1250-1251 Timestamp: 2024-08-13T16:22:58.430Z Learning: Avoid commenting on trailing commas in code reviews for the Snakemake repository as per the user's preference.Learnt from: johanneskoester PR: snakemake/snakemake#3014 File: snakemake/workflow.py:0-0 Timestamp: 2024-08-13T16:07:33.369Z Learning: In the Snakemake project, trailing commas are not needed in logger messages if Black formatting is used, as it doesn't require them in such contexts.
Ruff
snakemake/persistence.py
375-375: Missing type annotation for
selfin method(ANN101)
snakemake/workflow.py
1147-1147: Missing return type annotation for private function
log_missing_metadata_infoAdd return type annotation:
None(ANN202)
1159-1159: Trailing comma missing
Add trailing comma
(COM812)
1250-1250: Trailing comma missing
Add trailing comma
(COM812)
snakemake/dag.py
1266-1266: Trailing comma missing
Add trailing comma
(COM812)
1270-1270: Trailing comma missing
Add trailing comma
(COM812)
1277-1280: Unnecessary list comprehension
Remove unnecessary comprehension
(C419)
1280-1280: Trailing comma missing
Add trailing comma
(COM812)
1282-1282: Trailing comma missing
Add trailing comma
(COM812)
1289-1289: Trailing comma missing
Add trailing comma
(COM812)
1291-1291: Trailing comma missing
Add trailing comma
(COM812)
Additional comments not posted (11)
snakemake/persistence.py (1)
375-376: Add type annotations forhas_metadatamethod.The method
has_metadatais missing type annotations for itsselfparameter. Consider adding these annotations for better clarity.- def has_metadata(self, job: Job) -> bool: + def has_metadata(self: 'Persistence', job: Job) -> bool:Tools
Ruff
375-375: Missing type annotation for
selfin method(ANN101)
snakemake/jobs.py (2)
Line range hint
1638-1659: Initializeno_metadataattribute inReasonclass.The
no_metadataattribute is initialized toFalsein theReasonclass. This addition allows the class to manage scenarios where metadata is missing. Ensure that this attribute is used consistently across the codebase to handle such cases.Tools
Ruff
1641-1641: Missing return type annotation for special method
__init__Add return type annotation:
None(ANN204)
1641-1641: Missing type annotation for
selfin method(ANN101)
1805-1806: Functionjobs_to_rulenamesis well-defined.The function
jobs_to_rulenamescorrectly extracts and returns a sorted list of unique rule names from an iterable ofJobobjects. This utility function enhances job handling by organizing rule names effectively.snakemake/workflow.py (3)
1221-1222: Integration oflog_missing_metadata_infois appropriate.The call to
log_missing_metadata_infoduring a dry run effectively enhances the visibility of jobs missing metadata.Tools
Ruff
1222-1222: Do not implicitly
return Nonein function able to return non-NonevalueAdd explicit
Nonereturn value(RET502)
1250-1251: LGTM! The functionlog_provenance_infois implemented correctly.The use of
jobs_to_rulenamesenhances readability and aligns with the objective of improving logging clarity.Tools
Ruff
1250-1250: Trailing comma missing
Add trailing comma
(COM812)
1147-1160: LGTM! Verify the function usage in the codebase.The function
log_missing_metadata_infois implemented correctly and aligns with the PR objectives to improve metadata visibility.Ensure that this function is called in the appropriate contexts to achieve its intended purpose.
Verification successful
Let's correct the file type and re-run the search to verify the usage of the
log_missing_metadata_infofunction.
Function
log_missing_metadata_infois correctly utilized within the codebase.The function is called within
snakemake/workflow.py, ensuring that jobs without metadata are logged as intended.
- File:
snakemake/workflow.py- Function Call: Verified within the same file.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify all function calls to `log_missing_metadata_info`. # Test: Search for the function usage. Expect: Occurrences of the function being called in appropriate contexts. rg --type python -A 5 $'log_missing_metadata_info'Length of output: 80
Script:
#!/bin/bash # Description: Verify all function calls to `log_missing_metadata_info`. # Test: Search for the function usage. Expect: Occurrences of the function being called in appropriate contexts. rg --type py -A 5 $'log_missing_metadata_info'Length of output: 799
Tools
Ruff
1147-1147: Missing return type annotation for private function
log_missing_metadata_infoAdd return type annotation:
None(ANN202)
1159-1159: Trailing comma missing
Add trailing comma
(COM812)
snakemake/dag.py (5)
1261-1262: LGTM: Metadata check added.The addition of the metadata check before further evaluations is a logical improvement. This prevents unnecessary checks when metadata is absent.
1264-1267: LGTM: Conditional parameter change check.The code correctly checks for parameter changes only if the corresponding rerun trigger is set. This ensures evaluations are performed only when necessary.
Tools
Ruff
1266-1266: Trailing comma missing
Add trailing comma
(COM812)
1269-1271: LGTM: Conditional input change check.The code follows the pattern of checking for input changes only if the rerun trigger is set, maintaining consistency and efficiency.
Tools
Ruff
1270-1270: Trailing comma missing
Add trailing comma
(COM812)
1276-1283: LGTM: Code change check with list comprehension.The list comprehension is necessary to collect async generator items before applying
any(). This ensures proper evaluation of all items.Tools
Ruff
1277-1280: Unnecessary list comprehension
Remove unnecessary comprehension
(C419)
1280-1280: Trailing comma missing
Add trailing comma
(COM812)
1282-1282: Trailing comma missing
Add trailing comma
(COM812)
1288-1292: LGTM: Conditional software environment change check.The code correctly checks for changes in the software environment only if the rerun trigger is set, maintaining consistency with other checks.
Tools
Ruff
1289-1289: Trailing comma missing
Add trailing comma
(COM812)
1291-1291: Trailing comma missing
Add trailing comma
(COM812)
🤖 I have created a release *beep* *boop* --- ## [8.18.0](v8.17.0...v8.18.0) (2024-08-14) ### Features * show info on missing metadata ([#3014](#3014)) ([e502312](e502312)) ### Performance Improvements * cache mtime of scripts and notebooks ([#2965](#2965)) ([405a056](405a056)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced detailed reporting for missing metadata during workflow execution, enhancing visibility and debugging. - **Performance Improvements** - Implemented caching for script and notebook modification times, improving workflow execution efficiency. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
| if not self.workflow.persistence.has_metadata(job): | ||
| reason.no_metadata = True | ||
| else: |
There was a problem hiding this comment.
This change causes a bug: when using --drop-metadata (for filesystem speed), the mtime of script/notebook no longer triggers the workflow. I think we should handle this case, or make no_metadata a choice of RerunTrigger.
…data (#3148) <!--Add a description of your PR here--> See #3014 (review) ### QC <!-- Make sure that you can tick the boxes below. --> * [ ] The PR contains a test case for the changes or the changes are already covered by an existing test case. * [ ] The documentation (`docs/`) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake). <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Bug Fixes** - Improved logic for determining job execution conditions, ensuring accurate evaluations. - **Refactor** - Streamlined the `update_needrun` method for better readability and maintainability. - Consolidated checks for job metadata and conditions to eliminate redundancy. - **Style** - Minor adjustments to comments and formatting for enhanced clarity. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: Anfeng Li <anfeng.li@cern.ch> Co-authored-by: Johannes Köster <johannes.koester@tu-dortmund.de>
…data (snakemake#3148) <!--Add a description of your PR here--> See snakemake#3014 (review) ### QC <!-- Make sure that you can tick the boxes below. --> * [ ] The PR contains a test case for the changes or the changes are already covered by an existing test case. * [ ] The documentation (`docs/`) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake). <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Bug Fixes** - Improved logic for determining job execution conditions, ensuring accurate evaluations. - **Refactor** - Streamlined the `update_needrun` method for better readability and maintainability. - Consolidated checks for job metadata and conditions to eliminate redundancy. - **Style** - Minor adjustments to comments and formatting for enhanced clarity. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: Anfeng Li <anfeng.li@cern.ch> Co-authored-by: Johannes Köster <johannes.koester@tu-dortmund.de>



In dryrun, if there is no metadata recorded for certain jobs, we now print an info.
QC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit
New Features
Bug Fixes
Chores