fix: ignore derived parameters when storing job metadata and inferring rerun necessity of jobs (this gets rid of spurious triggers caused by e.g. changed resources, threads, remote storage configuration)#3154
Conversation
…g rerun necessity of jobs (this gets rid of spurious triggers caused by e.g. changed resources, threads, remote storage configuration)
WalkthroughThe pull request introduces enhancements to parameter management across several classes in the Snakemake framework. In Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- snakemake/jobs.py (3 hunks)
- snakemake/persistence.py (4 hunks)
- snakemake/rules.py (9 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
snakemake/jobs.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.snakemake/persistence.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.snakemake/rules.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
🪛 Ruff
snakemake/persistence.py
15-15:
functools.partialimported but unusedRemove unused import:
functools.partial(F401)
586-586: Use of
functools.lru_cacheorfunctools.cacheon methods can lead to memory leaks(B019)
🔇 Additional comments (17)
snakemake/persistence.py (1)
587-587:⚠️ Potential issueAvoid using
functools.lru_cacheon instance methods to prevent memory leaks.Applying
@lru_cacheto instance methods can lead to memory leaks because theselfinstance is part of the cache key, and the cache can grow indefinitely with different instances. Consider refactoring_paramsto avoid using@lru_cacheon an instance method.Apply this diff to remove the decorator:
- @lru_cache() def _params(self, job: Job):If caching is necessary, consider using an instance-level caching mechanism or limiting the cache size.
Likely invalid or redundant comment.
snakemake/rules.py (12)
699-699: Correct Initialization ofnon_derived_itemsParameterThe optional parameter
non_derived_items: typing.List[typing.Any] = Noneis properly initialized toNone, which is a good practice to avoid mutable default arguments.
709-709: Initialization ofis_derivedVariableThe variable
is_derivedis appropriately initialized toFalse, ensuring that it has a defined state before usage.
722-723: Conditional Check for Deriving FunctionsThe code correctly checks if
non_derived_itemsis notNonebefore determining if the item is a deriving function:if non_derived_items is not None: is_derived = self._is_deriving_function(item)
746-751: Proper Use ofzipfor UnpackingThe use of
zipto constructapply_resultsensures that each list is correctly aligned, facilitating the unpacking process in subsequent iterations:apply_results = zip( [None] * len(item), item, [_is_callable] * len(item), [is_derived] * len(item), )
754-757: Buildingapply_resultsfrom Dictionary ItemsWhen
itemis a dictionary, the comprehension accurately constructsapply_resultsby pairing keys and values along with the callable and derived status:apply_results = [ (name, item, _is_callable, is_derived) for name, item in item.items() ]
759-759: Handling Single Items inapply_resultsFor single items,
apply_resultsis correctly formed as a list containing a single tuple:apply_results = [(name, item, _is_callable, is_derived)]
761-761: Correct Unpacking in LoopThe for-loop accurately unpacks
name,item,from_callable, andis_derivedfromapply_results, which is essential for processing each element:for name, item, from_callable, is_derived in apply_results:
785-786: Appending Non-Derived ItemsNon-derived items are properly appended to
non_derived_itemswhen they are identified as such:if not is_derived and non_derived_items is not None: non_derived_items.append(concrete)
848-859: Addition of_is_deriving_functionClass MethodThe new class method
_is_deriving_functioneffectively identifies deriving functions based on the presence of specific parameters, enhancing the classification of function types:@classmethod def _is_deriving_function(cls, func): if is_callable(func): func_params = get_function_params(func) return ( "input" in func_params or "output" in func_params or "threads" in func_params or "resources" in func_params ) return False
891-891: Initialization ofnon_derived_paramsListThe list
non_derived_paramsis correctly initialized to collect non-derived parameters during parameter expansion:non_derived_params = []
912-912: Passingnon_derived_itemsto_apply_wildcardsThe inclusion of
non_derived_items=non_derived_paramsin the_apply_wildcardsmethod call allows for tracking of non-derived parameters:non_derived_items=non_derived_params,
925-925: Returning Non-Derived ParametersThe
expand_paramsmethod now returns a tuple containing bothparamsandnon_derived_params, which enhances the method's functionality:return params, non_derived_paramssnakemake/jobs.py (4)
225-225: Addition of_non_derived_paramsto__slots__and initializationThe attribute
_non_derived_paramsis appropriately added to__slots__and initialized toNonein the constructor. This maintains consistency and optimizes memory usage in classes using__slots__.Also applies to: 279-279
424-424: Lazy initialization ofparamsusing_expand_paramsThe modification to the
paramsproperty to lazily initialize_paramsby calling_expand_params()enhances code maintainability and ensures that parameters are expanded only when needed.
427-432: Introduction ofnon_derived_paramsproperty for better encapsulationThe new
non_derived_paramsproperty provides access to_non_derived_paramswith lazy initialization, following the same pattern as theparamsproperty. This improves code clarity and encapsulation.
433-437:⚠️ Potential issueEnsure
Rule.expand_paramsreturns bothparamsandnon_derived_paramsThe
_expand_paramsmethod expectsself.rule.expand_params()to return a tuple of(params, non_derived_params). Please verify that theexpand_paramsmethod in theRuleclass has been updated accordingly to return both values to prevent aTypeError.To verify, check the implementation of
expand_paramsin theRuleclass to ensure it returns a tuple containing bothparamsandnon_derived_params.
|
🤖 I have created a release *beep* *boop* --- ## [8.23.2](v8.23.1...v8.23.2) (2024-10-18) ### Bug Fixes * ignore derived parameters when storing job metadata and inferring rerun necessity of jobs (this gets rid of spurious triggers caused by e.g. changed resources, threads, remote storage configuration) ([#3154](#3154)) ([73ce212](73ce212)) * more robust handling of input metadata that avoids storing user or typoe specific local paths, and properly handles pipe or service files ([#3150](#3150)) ([756dc70](756dc70)) ### Documentation * Fix missing spaces in CLI help text ([#3146](#3146)) ([6416c36](6416c36)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>



QC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit
New Features
Bug Fixes
Documentation