feat: more robust parameter and code change detection and transparent reporting of detected changes#3140
Conversation
…g of detected changes
WalkthroughThe changes made in the Snakemake framework involve enhancements to job parameter tracking and management across several files: Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
snakemake/persistence.py (1)
444-454: Update the docstring forparams_changedmethodThe
params_changedmethod now returns aParamsChangeobject instead of a boolean. Consider updating the method's docstring to reflect this change and provide guidance on how to interpret the returned value.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- snakemake/dag.py (1 hunks)
- snakemake/jobs.py (2 hunks)
- snakemake/persistence.py (5 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
snakemake/dag.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.snakemake/jobs.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.snakemake/persistence.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
🔇 Additional comments (9)
snakemake/jobs.py (4)
1648-1649: LGTM: Import of NO_PARAMS_CHANGEThe import of
NO_PARAMS_CHANGEfromsnakemake.persistenceis a good addition. It suggests that the parameter change tracking has been refined to provide more detailed information.
1655-1655: Improved parameter change trackingThe initialization of
self.params_changedwithNO_PARAMS_CHANGEinstead of a boolean value indicates a more nuanced approach to tracking parameter changes. This change allows for more detailed reporting of parameter modifications.
1783-1785: Enhanced parameter change reportingThe
__str__method now includes the actual value ofself.params_changedin the output string. This provides more detailed information about the nature of parameter changes, improving the clarity of the reason for job re-execution.
Line range hint
1648-1785: Overall assessment: Improved parameter change handlingThe changes in the
Reasonclass enhance Snakemake's ability to track and report parameter changes more precisely. This improvement allows for more transparent reporting of detected changes, aligning well with the pull request's objectives. The modifications are well-implemented and do not introduce any apparent issues or inconsistencies.snakemake/persistence.py (4)
7-7: Importingdataclassis necessaryThe addition of
dataclassfrom thedataclassesmodule is appropriate for defining theParamsChangeclass later in the code.
20-20: Added typing imports for type annotationsThe inclusion of
Any,Optional, andSetfrom thetypingmodule enhances type annotations and code readability.
26-26: ImportingTBDStringfor parameter handlingThe import of
TBDStringis necessary for handling unrepresentable parameters in the_serialize_param_builtinmethod.
515-535: Confirm the comparison withTBDStringIn the
_serialize_param_builtinmethod, you checkparam is not TBDStringto excludeTBDStringparameters. Ensure that this comparison correctly excludes all instances ofTBDString. IfTBDStringcan have multiple instances, consider usingnot isinstance(param, TBDString)instead.snakemake/dag.py (1)
1307-1307: Direct assignment toreason.params_changedis appropriateThe direct assignment of
reason.params_changedimproves code clarity and is appropriate ifself.workflow.persistence.params_changed(job)returns a boolean value.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
snakemake/persistence.py (1)
444-454: LGTM: Improved parameter change trackingThe refactoring of the
params_changedmethod significantly improves its clarity and functionality. The use of theParamsChangeobject provides a more structured way to track parameter changes across job executions.Consider adding a comment explaining the purpose of the
NO_PARAMS_CHANGEconstant and its usage here. This would enhance the readability of the code.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- snakemake/persistence.py (5 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/persistence.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
🔇 Additional comments (4)
snakemake/persistence.py (4)
515-535: LGTM: Improved parameter serialization logicThe addition of
TBDStringto the condition in_serialize_param_builtinmethod ensures that it's treated as an unrepresentable parameter. This change aligns well with the new parameter serialization logic and improves the robustness of the method.
725-725: LGTM: Useful constant for no parameter changesThe addition of the
NO_PARAMS_CHANGEconstant provides a convenient and clear way to represent the absence of parameter changes. It's effectively used in theparams_changedmethod as an initial value.
7-7: LGTM: Necessary imports addedThe new imports (
dataclass,field,Any,Optional,Set, andTBDString) have been correctly added to support the newParamsChangedataclass, type hinting, and the updated_serialize_param_builtinmethod. These additions enhance the code's type safety and functionality.Also applies to: 20-20, 26-26
Line range hint
1-725: Overall: Excellent enhancements to parameter change tracking and persistenceThe changes in this file significantly improve the parameter change tracking and serialization logic in the Snakemake persistence module. The introduction of the
ParamsChangedataclass and the refactoring of theparams_changedmethod provide a more structured and robust approach to handling parameter changes across job executions.Key improvements:
- Enhanced parameter serialization with
TBDStringhandling- Refactored
params_changedmethod for better clarity and functionality- Well-designed
ParamsChangedataclass for encapsulating parameter changes- Addition of the
NO_PARAMS_CHANGEconstant for convenienceThese changes align well with the PR objective of enhancing the detection of parameter changes and ensuring more transparent reporting.
To ensure that the changes are consistent across the codebase, please run the following script to check for any other occurrences of
params_changedthat might need updating:This will help verify that the new
ParamsChangereturn value is handled correctly throughout the codebase.✅ Verification successful
The occurrences of
params_changedinsnakemake/dag.pyare consistent with the recent changes inpersistence.py. Both instances correctly utilize the updated method signature:
self.workflow.persistence.params_changed(job)– Uses the requiredjobparameter.elif self.workflow.persistence.params_changed(job, file=f):– Appropriately provides bothjoband the optionalfileparameter.These usages align with the enhancements made to the parameter change tracking and serialization logic, ensuring robust and clear handling of parameter changes across the codebase.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for other occurrences of params_changed that might need updating # Search for params_changed method calls rg "params_changed\(" --type python -g '!persistence.py'Length of output: 89
Script:
#!/bin/bash # Description: Check for other occurrences of params_changed that might need updating # Search for params_changed method calls in Python files excluding persistence.py rg "params_changed\(" --glob "*.py" --glob "!snakemake/persistence.py"Length of output: 268
Script:
#!/bin/bash # Description: Extract surrounding context for params_changed calls in snakemake/dag.py # Extract lines around the params_changed calls for better context rg "params_changed\(" --context 3 snakemake/dag.pyLength of output: 894
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
|
Please format your code with black: |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (2)
snakemake/dag.py (2)
1309-1309: Consider evaluating parameter changes regardless of outdated metadataCurrently, parameter changes are checked only after verifying if the metadata is outdated. If parameters can change independently of metadata status, consider checking for parameter changes regardless of whether the metadata is outdated to ensure all relevant changes are detected.
Line range hint
1306-1317: Review the logical flow for rerun triggersThe code checks for outdated metadata first and then evaluates other rerun triggers like
PARAMS,INPUT,CODE, andSOFTWARE_ENV. Ensure that this sequence correctly handles all scenarios where a job needs to be rerun, especially if some triggers might require action even when metadata is up-to-date.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
- snakemake/dag.py (1 hunks)
- snakemake/jobs.py (3 hunks)
- snakemake/persistence.py (11 hunks)
- snakemake/workflow.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- snakemake/jobs.py
🧰 Additional context used
📓 Path-based instructions (3)
snakemake/dag.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.snakemake/persistence.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.snakemake/workflow.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
🔇 Additional comments (13)
snakemake/persistence.py (11)
7-7: Imports Added CorrectlyThe imports for
dataclass,field, and the typing annotations (Any,Optional,Set) are appropriately added to support the new code changes.Also applies to: 20-20
26-26: ImportTBDStringfor Parameter SerializationImporting
TBDStringensures that parameters of this type are properly handled during serialization, preventing potential issues with unrepresentable objects.
36-36: DefineRECORD_FORMAT_VERSIONConstantDefining the
RECORD_FORMAT_VERSIONas3establishes a clear versioning system for metadata records, facilitating future compatibility checks and migrations.
329-329: Includerecord_format_versionin MetadataAdding
record_format_versionto the metadata ensures that each record carries version information, which is essential for detecting outdated metadata and maintaining data integrity.
400-405: Addhas_outdated_metadataMethodThe
has_outdated_metadatamethod correctly checks if any output files have metadata with a version older than the currentRECORD_FORMAT_VERSION. This function enhances the system's ability to detect and handle outdated records.
427-429: Addrecord_format_versionMethodThe
record_format_versionmethod provides a convenient way to retrieve the metadata's format version for a given path, improving code readability and maintainability.
455-465: Refactorparams_changedMethod to UseParamsChangeThe
params_changedmethod now utilizes theParamsChangedataclass to accumulate parameter differences across output files. This refactoring enhances the clarity and robustness of parameter change detection.
477-480: Adjust_code_changedMethod for Version CompatibilityBy checking the
record_format_version, the_code_changedmethod appropriately handles cases where reliable code comparisons are not possible due to version differences. This prevents false positives in code change detection.
508-512: Modify_codeMethod to Consider Only Shell CommandsFocusing on shell commands in the
_codemethod is a sensible approach, as it avoids issues with pickling Python code across different versions. This change enhances the reliability of code tracking.
533-553: Update_serialize_param_builtinto HandleTBDStringBy treating
TBDStringas unrepresentable, the_serialize_param_builtinmethod prevents potential serialization issues, ensuring that incomplete or placeholder strings do not cause errors during parameter serialization.
699-735: ImplementParamsChangeDataclassThe new
ParamsChangedataclass is well-designed, with appropriate default initialization ofonly_oldandonly_newas empty sets. The methods__bool__,__or__, and__str__are correctly implemented, providing useful functionality for parameter change detection and representation.snakemake/dag.py (2)
1306-1307: Ensurehas_outdated_metadataaccurately detects outdated jobsThe new condition
if self.workflow.persistence.has_outdated_metadata(job):introduces a check for outdated metadata. Confirm that thehas_outdated_metadatamethod correctly identifies when a job's metadata is outdated to prevent unnecessary job executions or missing necessary updates.
Line range hint
1309-1317: Ensure comprehensive handling of all rerun triggersMake sure that all specified rerun triggers (
PARAMS,INPUT,CODE,SOFTWARE_ENV) are thoroughly checked and thatreasonis updated accordingly for each. This guarantees that any changes necessitating a job rerun are appropriately captured.
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- snakemake/workflow.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/workflow.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
…D for storing metadata instead of local storage copy.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (6)
snakemake/persistence.py (4)
400-405: LGTM: Added metadata version checkThe
has_outdated_metadatamethod is a valuable addition for identifying outdated metadata. This can be crucial for triggering updates or migrations when necessary.Consider using
any()with a generator expression for a slight optimization:def has_outdated_metadata(self, job: Job) -> bool: return any( self.metadata(path).get("record_format_version", 0) < RECORD_FORMAT_VERSION for path in job.output )This change would short-circuit the evaluation as soon as an outdated record is found.
455-465: LGTM: Improved parameter change detectionThe refactoring of
params_changedmethod significantly improves the clarity and functionality of parameter change detection. The use of theParamsChangeclass allows for more detailed reporting of changes.Consider using a set comprehension for a slight optimization:
new = {param for param in self._params(job)}This change would create the set directly, potentially improving performance for larger parameter sets.
Line range hint
477-512: LGTM: Improved code change detection with version checkThe changes to
_code_changedmethod improve the reliability of code change detection by checking the record format version and focusing on shell commands. This avoids issues with Python version differences in code pickling.Note: This change might miss code changes in pure Python rules. Consider adding a comment explaining this limitation and the rationale behind the decision.
🧰 Tools
🪛 Ruff
506-506: Use of
functools.lru_cacheorfunctools.cacheon methods can lead to memory leaks(B019)
514-514: Use of
functools.lru_cacheorfunctools.cacheon methods can lead to memory leaks(B019)
702-738: LGTM: Well-designed ParamsChange dataclassThe new
ParamsChangedataclass is a well-designed addition that effectively encapsulates the concept of parameter changes. The methods for boolean evaluation, union operations, and string representation are comprehensive and well-implemented.Consider adding type hints to the methods for improved clarity:
def __bool__(self) -> bool: return bool(self.only_old or self.only_new) def __or__(self, other: 'ParamsChange') -> 'ParamsChange': # ... existing implementation ... def __str__(self) -> str: # ... existing implementation ...These type hints would enhance code readability and IDE support.
snakemake/workflow.py (2)
1322-1323: Good use of abstraction, but consider further refactoring.The
log_outdated_metadata_infomethod, likelog_missing_metadata_info, effectively uses the genericlog_metadata_infomethod. This approach is good for maintainability and follows the DRY principle.However, given the similarity between
log_missing_metadata_infoandlog_outdated_metadata_info, consider refactoring to reduce duplication further:You could create a single method that handles both cases:
def log_metadata_status(self, status): self.log_metadata_info(f"{status}_metadata", status) # Usage: self.log_metadata_status("missing") self.log_metadata_status("outdated")This approach would eliminate the need for separate methods while maintaining flexibility.
1325-1349: Informative method, but consider refactoring for improved structure and consistency.The
log_provenance_infomethod provides valuable information about provenance-triggered jobs and guidance on modifying this behavior. However, there are a few suggestions for improvement:
The method is quite long. Consider breaking it down into smaller, more focused methods for better readability and maintainability.
String formatting is inconsistent. Stick to f-strings throughout for better readability and performance.
The calls to
log_missing_metadata_infoandlog_outdated_metadata_infoat the end seem disconnected from the main purpose of this method. Consider moving these to a separate method or clarify their relationship to provenance info.Here's a suggested refactoring:
def log_provenance_info(self): self._log_provenance_triggered_jobs() self._log_additional_metadata_info() def _log_provenance_triggered_jobs(self): provenance_triggered_jobs = [ job for job in self.dag.needrun_jobs(exclude_finished=False) if self.dag.reason(job).is_provenance_triggered() ] if provenance_triggered_jobs: logger.info( "Some jobs were triggered by provenance information, " "see 'reason' section in the rule displays above.\n" "If you prefer that only modification time is used to " "determine whether a job shall be executed, use the command " "line option '--rerun-triggers mtime' (also see --help).\n" "If you are sure that a change for a certain output file (say, <outfile>) won't " "change the result (e.g. because you just changed the formatting of a script " "or environment definition), you can also wipe its metadata to skip such a trigger via " "'snakemake --cleanup-metadata <outfile>'. " ) logger.info( f"Rules with provenance triggered jobs: " f"{' '.join(jobs_to_rulenames(provenance_triggered_jobs))}" ) logger.info("") def _log_additional_metadata_info(self): self.log_missing_metadata_info() self.log_outdated_metadata_info()This refactoring improves readability and separates concerns more clearly.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
- snakemake/persistence.py (11 hunks)
- snakemake/workflow.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
snakemake/persistence.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.snakemake/workflow.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
🔇 Additional comments (7)
snakemake/persistence.py (3)
7-7: LGTM: Improved imports and versioningThe new imports enhance type hinting, and the introduction of
RECORD_FORMAT_VERSIONsuggests a good practice for managing data structure changes.Also applies to: 20-20, 26-26, 31-31, 36-36
329-329: LGTM: Added record format versioningAdding the
record_format_versionto the record is a good practice for maintaining backward compatibility and facilitating future data migrations.
427-429: LGTM: Added record format version getterThe
record_format_versionmethod provides a clean and convenient way to access the record format version for a specific path. This addition enhances the API for version-related operations.snakemake/workflow.py (4)
1304-1317: Well-structured and efficient implementation.The
log_metadata_infomethod is a well-implemented helper function for logging metadata information about jobs. It uses efficient list comprehension for filtering jobs and modern f-strings for formatting the log message. The function is also generic, allowing it to handle different types of metadata attributes.
1319-1320: Good use of abstraction.The
log_missing_metadata_infomethod effectively uses the more genericlog_metadata_infoto handle the specific case of missing metadata. This approach follows the DRY principle and improves code maintainability.
Line range hint
1-1350: Summary: Valuable enhancements to metadata and provenance loggingThe changes to
snakemake/workflow.pyintroduce new logging capabilities for metadata and provenance information. These enhancements provide users with more detailed insights into job status and execution triggers. The implementation is generally well-structured and follows good coding practices.Key points:
- New helper methods for logging metadata information are concise and reusable.
- The
log_provenance_infomethod, while informative, could benefit from some refactoring for improved readability and maintainability.- Integration of new logging calls in the
executemethod is appropriate, but performance impact should be considered for large workflows.Overall, these changes should improve the user experience by providing more comprehensive information about the workflow execution process.
1247-1248: Improved logging with additional metadata and provenance information.The new method calls
self.log_missing_metadata_info(),self.log_outdated_metadata_info(), andself.log_provenance_info()enhance the logging capabilities of the workflow execution. They provide valuable information about the metadata status of jobs and provenance information.The placement of these calls within the
executemethod is appropriate:
- Logging missing and outdated metadata when there's nothing to be done.
- Logging provenance information in the dryrun case.
Consider the performance impact of these new logging calls, especially for large workflows. It might be beneficial to add some basic performance metrics. You can use the following script to check the execution time of these new logging methods:
This will help ensure that the new logging doesn't introduce significant overhead, especially for large workflows.
Also applies to: 1263-1263
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (5)
tests/test_params_outdated_code/Snakefile (2)
10-13: LGTM: Well-structured rule with advanced Snakemake features.This rule effectively demonstrates advanced Snakemake features:
- Use of lambda function in params
- Wildcard in output for dynamic file creation
- Complex shell command for testing
The structure is correct and serves the purpose of testing params syntax well.
Minor suggestion: Consider adding a name to this rule for better readability and easier referencing in larger workflows.
1-13: Effective test file for Snakemake params syntax.This Snakefile serves its purpose well as a test for params syntax in Snakemake:
- It demonstrates a simple rule structure with only input.
- It showcases a more complex rule with lambda functions in params, wildcards in output, and a multi-step shell command.
The file is concise yet covers different aspects of Snakemake syntax, making it a good unit test for the params feature. It aligns well with the PR objective of enhancing parameter change detection.
As the codebase grows, consider organizing these test Snakefiles into a structured test suite. This will help maintain test coverage and make it easier to run specific tests for different Snakemake features.
.github/workflows/main.yml (2)
67-67: LGTM! Consider version pinning for reproducibility.The addition of
auto-update-conda: trueensures that the CI environment uses the latest version of conda, which can include important bug fixes and improvements. This aligns well with the PR's objective of enhancing robustness.For improved reproducibility, consider pinning the conda version to a specific release instead of always using the latest. This can be done by specifying a version in the
miniforge-versionparameter. For example:- name: Setup snakemake environment uses: conda-incubator/setup-miniconda@v3 with: miniforge-version: 4.10.3-10 environment-file: test-environment-${{ matrix.py_ver }}.yml environment-name: snakemake auto-update-conda: falseThis approach balances the benefits of using a recent, known-good version of conda with the stability and reproducibility of a fixed environment.
161-161: LGTM! Consider version pinning for reproducibility (Windows).The addition of
auto-update-conda: truefor the Windows environment mirrors the change made for the Linux environment. This consistency is good for maintaining parity across different operating systems.As suggested for the Linux environment, consider pinning the conda version for the Windows environment as well. This ensures consistency and reproducibility across both platforms. For example:
- name: Setup snakemake environment uses: conda-incubator/setup-miniconda@v3 with: miniforge-version: 4.10.3-10 environment-file: test-environment.yml environment-name: snakemake auto-update-conda: falseThis approach would provide a consistent, reproducible environment for both Linux and Windows CI runs.
tests/tests.py (1)
267-271: New test function implemented correctly.The
test_params_outdated_code(mocker)function is well-implemented to test the behavior of thehas_outdated_metadatamethod in thePersistenceclass. It correctly uses themockerfixture to create a spy and runs the appropriate test case.However, there's a minor improvement that can be made to the assertion:
Consider changing the assertion to use Python's built-in boolean comparison instead of comparing to
Trueexplicitly:- assert spy.spy_return == True + assert spy.spy_returnThis change makes the code more idiomatic and slightly more efficient.
🧰 Tools
🪛 Ruff
271-271: Avoid equality comparisons to
True; useif spy.spy_return:for truth checksReplace with
spy.spy_return(E712)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (3)
tests/test_params_outdated_code/.snakemake/log/2024-10-14T123852.529917.snakemake.logis excluded by!**/*.logtests/test_params_outdated_code/expected-results/somedir/test.outis excluded by!**/*.outtests/test_params_outdated_code/somedir/test.outis excluded by!**/*.out
📒 Files selected for processing (4)
- .github/workflows/main.yml (2 hunks)
- tests/test_params_outdated_code/.snakemake/metadata/c29tZWRpci90ZXN0Lm91dA== (1 hunks)
- tests/test_params_outdated_code/Snakefile (1 hunks)
- tests/tests.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
tests/tests.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
🪛 Ruff
tests/tests.py
271-271: Avoid equality comparisons to
True; useif spy.spy_return:for truth checksReplace with
spy.spy_return(E712)
🔇 Additional comments (7)
tests/test_params_outdated_code/Snakefile (2)
1-5: LGTM: Clear docstring and appropriate shell executable setting.The docstring succinctly explains the purpose of the file, which is good for maintainability. Setting the shell executable to "bash" is a common and appropriate practice in Snakefiles.
7-8: Consider adding a name and output/action to the first rule.While Snakemake allows unnamed rules, adding a name would improve readability and make it easier to reference this rule. Additionally, this rule only specifies an input, which is unusual. Typically, rules have at least an output or action. Consider adding these elements to make the rule's purpose clearer.
To ensure this rule is intentional and not missing any elements, you can run the following command:
This will help verify if this input file is used elsewhere in the workflow.
tests/test_params_outdated_code/.snakemake/metadata/c29tZWRpci90ZXN0Lm91dA== (4)
1-1: Verify timestamp accuracyThe
starttimeandendtimevalues (1728902332.536915 and 1728902332.541915) correspond to dates in 2024, which is in the future. This might indicate an issue with the system clock or how timestamps are being generated.To check if this is an isolated incident or a pattern, run:
#!/bin/bash # Search for timestamps in other metadata files rg --type json '"starttime":\s*\d+\.\d+' .snakemake/metadata | head -n 10
1-1: Review empty input array and checksumsThe
inputarray andinput_checksumsobject are both empty. While this might be intentional, it's worth verifying:
- If the rule truly doesn't require any input files.
- If input file integrity checks are being skipped unintentionally.
To check if this is common in other metadata files, run:
#!/bin/bash # Count metadata files with empty input arrays and checksums echo "Files with empty input:" rg --type json '"input":\s*\[\]' .snakemake/metadata | wc -l echo "Files with empty input_checksums:" rg --type json '"input_checksums":\s*\{\}' .snakemake/metadata | wc -l
1-1:⚠️ Potential issueReview parameter formatting
The
paramsarray contains strings with single quotes:["'-f'", "'somedir'"]. This unusual formatting might cause issues when these parameters are used.Consider removing the single quotes:
-"params": ["'-f'", "'somedir'"], +"params": ["-f", "somedir"],To check if this is a consistent pattern in the codebase, run:
✅ Verification successful
Review Parameter Formatting Verified
The
paramsarray formatting issue is present in the specified file, and no similar patterns were found in other metadata files.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for similar parameter formatting in other metadata files rg --type json '"params":\s*\[.*'.*'.*\]' .snakemake/metadataLength of output: 393
1-1:⚠️ Potential issueCaution: Potentially risky shell command
The shell command
rm -r somedir; mkdir -p somedir; touch -f somedir/test.outinvolves removing and recreating a directory. This operation can be risky if not handled carefully, especially in a production environment.Consider the following suggestions:
- Use safer alternatives like
rm -rfto suppress errors if the directory doesn't exist.- Add error checking or use
set -eto ensure the script stops if any command fails.- Consider using
mktempfor creating temporary directories to avoid potential conflicts.To verify the usage of this command across the codebase, run:
✅ Verification successful
Verification Completed: No Widespread Risk Identified
The search for
rm -rcommands revealed usage only inDockerfileandMakefilefiles, which are appropriate contexts for such operations. No instances were found in Python files or other parts of the main codebase.
- The
rm -rcommand in the reviewed Snakemake metadata file appears to be an isolated case.- Ensure that such shell commands are used cautiously and are necessary for the workflow.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for similar rm -r commands in the codebase rg --type python "rm\s+-r"Length of output: 59
Script:
#!/bin/bash # Search for similar rm -r commands in all files rg "rm\s+-r" # Alternatively, search specifically in Python files by extension rg "rm\s+-r" --glob "*.py"Length of output: 198
tests/tests.py (1)
16-16: Import statement added correctly.The new import statement
from snakemake.persistence import Persistenceis correctly placed and necessary for the new test function.
🤖 I have created a release *beep* *boop* --- ## [8.23.0](v8.22.0...v8.23.0) (2024-10-14) ### Features * More robust parameter and code change detection and transparent reporting of detected changes ([#3140](#3140)). For setups using remote storage, this will lead to missing metadata records because the key used for storing the metadata is now the remote storage query instead of the path to the local copy. The reason is that the latter can be user-specific and therefore previously could have let led to e.g. different rerun behavior for different users. ### Documentation * update installation docs to reflect changes in conda ecosystem ([#3141](#3141)) ([e83b8aa](e83b8aa)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Johannes Köster <johannes.koester@tu-dortmund.de>
|
Hi Johannes! @johanneskoester Is the change to |



QC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit
New Features
ParamsChangedataclass for improved clarity in parameter change tracking.Improvements