feat: allow to mark input files of rules as ancient via the API or command line interface (and thereby also via workflow specific profiles). Putting this into a workflow specific profile (or specifying as argument) allows to overrule rerun triggers caused by file modification dates where the user knows better.#3171
feat: allow to mark input files of rules as ancient via the API or command line interface (and thereby also via workflow specific profiles). Putting this into a workflow specific profile (or specifying as argument) allows to overrule rerun triggers caused by file modification dates where the user knows better.#3171johanneskoester merged 15 commits intomainfrom
Conversation
…mmand line interface (and thereby also via workflow specific profiles). Putting this into a workflow specific profile (or specifying as argument) allows to overrule rerun triggers caused by file modification dates where the user knows better.
📝 Walkthrough📝 WalkthroughWalkthroughThis pull request introduces enhancements to the Snakemake command-line interface and the internal handling of input items within the Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (3)
snakemake/settings/types.py (1)
202-204: Add documentation for the new field.The
consider_ancientfield lacks documentation explaining its purpose and expected values. Since this is a public API change, please add a docstring describing:
- Purpose of the field
- Format of the mapping (what the keys and values represent)
- How it integrates with the ancient file marking feature
consider_ancient: Mapping[str, AnySet[Union[str, int]]] = field( default_factory=dict - ) + ) + """Mapping of rule names to sets of input items that should be considered as ancient. + + Keys are rule names and values are sets containing either string paths or integer indices + of input items that should be marked as ancient, thereby preventing rerun triggers + based on modification times. + """snakemake/rules.py (1)
408-408: Add type annotation for mark_ancient parameter.Consider adding type annotation for better code clarity:
- def _set_inoutput_item(self, item, output=False, name=None, mark_ancient=False): + def _set_inoutput_item(self, item, output: bool = False, name: str | None = None, mark_ancient: bool = False):snakemake/cli.py (1)
760-771: Enhance help text with format examples.The argument is well-structured and placed appropriately. Consider enhancing the help text with explicit format examples to make it more user-friendly.
group_exec.add_argument( "--consider-ancient", metavar="RULE=INPUTITEMS", nargs="+", default=dict(), parse_func=parse_consider_ancient, - help="Consider given input items of given rules as ancient, i.e. not triggering " - "re-runs if they are newer than the output files. " - "Putting this into a workflow specific profile (or specifying as argument) " - "allows to overrule rerun triggers caused by file modification dates where the " - "user knows better.", + help="Consider given input items of given rules as ancient, i.e. not triggering " + "re-runs if they are newer than the output files. " + "Format: RULE=ITEM1,ITEM2,... where ITEM can be either an input name or " + "0-based index (e.g. 'myrule=input1,0,input2'). " + "Putting this into a workflow specific profile (or specifying as argument) " + "allows to overrule rerun triggers caused by file modification dates where the " + "user knows better.", )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- snakemake/cli.py (3 hunks)
- snakemake/rules.py (5 hunks)
- snakemake/settings/types.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
snakemake/cli.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.snakemake/rules.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.snakemake/settings/types.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
🪛 Ruff
snakemake/cli.py
120-120: Within an
exceptclause, raise exceptions withraise ... from errorraise ... from Noneto distinguish them from errors in exception handling(B904)
snakemake/settings/types.py
6-6:
typing.Listimported but unusedRemove unused import:
typing.List(F401)
🔇 Additional comments (2)
snakemake/rules.py (2)
275-283: LGTM: Input handling properly integrates ancient file marking.The implementation correctly retrieves ancient file settings from the workflow configuration and properly passes the mark_ancient flag to the underlying item processing.
490-491: LGTM: Ancient flag is properly applied to input items.The implementation correctly applies the ancient flag to input items when specified, maintaining consistency with the existing codebase while adding the new functionality.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
snakemake/settings/types.py (1)
127-129: Add documentation for the consider_ancient field.The new field would benefit from documentation in the
WorkflowSettingsclass docstring explaining:
- Purpose: How it's used to mark input files as ancient
- Key format: What the string keys represent (rule names?)
- Value format: What the set of strings/integers represent
- Usage: How it integrates with the API/CLI functionality
Example docstring addition:
""" Parameters ---------- consider_ancient: Mapping[str, AnySet[Union[str, int]]] Mapping of rule names to sets of input items that should be considered as ancient. Ancient inputs do not trigger reruns based on their modification time. """
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- snakemake/cli.py (4 hunks)
- snakemake/rules.py (5 hunks)
- snakemake/settings/types.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- snakemake/rules.py
🧰 Additional context used
📓 Path-based instructions (2)
snakemake/cli.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.snakemake/settings/types.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
🪛 Ruff
snakemake/cli.py
120-120: Within an
exceptclause, raise exceptions withraise ... from errorraise ... from Noneto distinguish them from errors in exception handling(B904)
snakemake/settings/types.py
6-6:
typing.Listimported but unusedRemove unused import:
typing.List(F401)
🔇 Additional comments (3)
snakemake/settings/types.py (1)
6-6: Skipping comment about unused List import.A previous review already correctly identified the unused
Listimport issue.🧰 Tools
🪛 Ruff
6-6:
typing.Listimported but unusedRemove unused import:
typing.List(F401)
snakemake/cli.py (2)
760-771: LGTM: CLI argument is well-defined.The
--consider-ancientargument is properly implemented with:
- Clear help text explaining purpose and usage
- Appropriate default value and parser function
- Consistent style with other similar arguments
2032-2032: LGTM: Proper integration with workflow settings.The consider_ancient parameter is correctly added to the WorkflowSettings constructor call.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
|
Please format your code with black: |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (2)
tests/test_ancient_cli/Snakefile (1)
10-17: Consider using named inputs for consistency.While the rule works correctly, consider using named inputs (like in rule B) for consistency across the test file.
rule A: input: - "1" + x="1" output: "2", shell: - "cat {input} > {output}" + "cat {input.x} > {output}"tests/tests.py (1)
241-246: Add docstring and consider additional test cases.The test implementation looks good but could be improved with:
- A docstring explaining the test's purpose and expected behavior
- Additional test cases covering edge cases like:
- Invalid input format
- Multiple ancient file specifications
- Interaction with other CLI flags
Example docstring and additional test cases:
def test_ancient_cli(): + """Test the --consider-ancient CLI flag for marking input files as ancient. + + Validates that input files can be marked as ancient through the command line + interface, which prevents them from triggering reruns based on timestamps. + """ run( dpath("test_ancient_cli"), shellcmd="snakemake --consider-ancient A=0 B=x", ) +def test_ancient_cli_invalid(): + """Test error handling for invalid --consider-ancient arguments.""" + run( + dpath("test_ancient_cli"), + shellcmd="snakemake --consider-ancient invalid_format", + shouldfail=True + ) + +def test_ancient_cli_multiple(): + """Test multiple --consider-ancient specifications.""" + run( + dpath("test_ancient_cli"), + shellcmd="snakemake --consider-ancient A=0 --consider-ancient B=x", + )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (9)
- snakemake/cli.py (4 hunks)
- snakemake/rules.py (5 hunks)
- snakemake/settings/types.py (2 hunks)
- tests/test_ancient_cli/Snakefile (1 hunks)
- tests/test_ancient_cli/expected-results/1 (1 hunks)
- tests/test_ancient_cli/expected-results/2 (1 hunks)
- tests/test_ancient_cli/expected-results/3 (1 hunks)
- tests/test_ancient_cli/expected-results/4 (1 hunks)
- tests/tests.py (1 hunks)
✅ Files skipped from review due to trivial changes (4)
- tests/test_ancient_cli/expected-results/1
- tests/test_ancient_cli/expected-results/2
- tests/test_ancient_cli/expected-results/3
- tests/test_ancient_cli/expected-results/4
🚧 Files skipped from review as they are similar to previous changes (1)
- snakemake/settings/types.py
🧰 Additional context used
📓 Path-based instructions (3)
snakemake/cli.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.snakemake/rules.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.tests/tests.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
🪛 Ruff
snakemake/cli.py
134-134: Within an
exceptclause, raise exceptions withraise ... from errorraise ... from Noneto distinguish them from errors in exception handling(B904)
🔇 Additional comments (10)
tests/test_ancient_cli/Snakefile (4)
1-3: Clever test setup for validating ancient file behavior.The initialization creates output files before input files with a time delay, which is an excellent test pattern for validating the --consider-ancient functionality. This ensures that input files (1,3) are newer than output files (2,4), which would typically trigger a rebuild without the ancient marking.
5-9: LGTM! Clear target definition.The rule correctly specifies the final targets that correspond to the outputs of rules A and B.
18-24: LGTM! Well-structured rule definition.The rule correctly uses named inputs and properly defines the file operation.
1-24: Verify test function implementation.This Snakefile provides good coverage for the --consider-ancient feature, but let's ensure there's proper test function implementation.
✅ Verification successful
Test implementation is properly in place
The test function
test_ancient_cliis implemented intests/tests.pyand correctly executes the Snakefile with the--consider-ancientflag, validating the functionality through the provided test case.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for corresponding test function echo "Searching for test_ancient_cli function in test files..." rg -l "def test_ancient_cli" tests/Length of output: 160
Script:
#!/bin/bash # Let's try a broader search to find test files and their content in the test_ancient_cli directory echo "Searching for test files in test_ancient_cli directory..." fd . tests/test_ancient_cli/ -t f -x cat {} # Also search for any test functions that might be related to ancient functionality echo -e "\nSearching for test functions related to ancient..." rg "def test.*ancient" tests/Length of output: 687
Script:
#!/bin/bash # Let's check the parent directory structure and any test files that might contain the tests echo "Checking parent directory structure..." ls -la tests/test_ancient_cli/ # Let's also check for any pytest files in the parent directory echo -e "\nChecking for test files in parent directory..." ls -la tests/ # Search for any conftest.py that might contain the test implementation echo -e "\nSearching for conftest.py files..." fd conftest.py tests/Length of output: 24879
Script:
#!/bin/bash # Let's check the content of common.py and conftest.py as they might contain the test implementation echo "Checking common.py for test implementation..." cat tests/common.py echo -e "\nChecking conftest.py for test implementation..." cat tests/conftest.py # Also check tests.py as it might contain the test implementation echo -e "\nChecking tests.py for test implementation..." rg "def test.*ancient" tests/tests.pyLength of output: 18997
Script:
#!/bin/bash # Let's check the implementation of the ancient test functions in tests.py echo "Checking test_ancient_cli implementation in tests.py..." rg "def test_ancient_cli" tests/tests.py -A 20Length of output: 594
snakemake/rules.py (4)
275-285: LGTM: Well-structured implementation of ancient file markingThe implementation correctly retrieves and applies the "consider_ancient" settings from workflow settings, handling both positional and keyword arguments appropriately.
410-410: LGTM: Clean implementation of ancient flag settingThe mark_ancient parameter is well integrated into the method, and the ancient flag is set at the appropriate point in the execution flow.
Also applies to: 492-493
473-475: LGTM: Proper handling of dependencies and temp filesThe code correctly manages rule dependencies and implements conditional temp file marking based on workflow settings.
Also applies to: 478-480
275-285: Verify the ancient flag handling across the codebaseLet's verify the consistency of ancient flag handling and its usage across the codebase.
Also applies to: 410-410, 492-493
✅ Verification successful
Ancient flag handling is properly integrated and consistent across the codebase
The verification shows that the ancient flag implementation is well-integrated:
- The
mark_ancientparameter in_set_inoutput_itemis used consistently for both positional and keyword inputs- Ancient flag handling is properly implemented in:
- CLI parsing via
--consider-ancientargument- Workflow settings via
WorkflowSettings.consider_ancient- DAG execution logic in
dag.pythat correctly handles ancient files- IO system with proper
is_ancientproperty checks- Documentation in
docs/snakefiles/rules.rstaccurately describes the behavior- Tests exist to verify the functionality (
test_ancient,test_ancient_cli,test_ancient_dag)🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the usage and handling of ancient flags # Test 1: Check for other occurrences of ancient flag handling echo "Checking for ancient flag handling..." rg "ancient" -A 3 -B 3 # Test 2: Check for potential mark_ancient parameter usage echo "Checking for mark_ancient parameter usage..." ast-grep --pattern 'mark_ancient'Length of output: 12876
snakemake/cli.py (2)
776-787: LGTM! Well-documented CLI argument.The argument definition is clear, follows the established pattern, and includes comprehensive help text that explains both the functionality and its practical use case.
2048-2048: LGTM! Proper integration with workflow settings.The consider_ancient parameter is correctly integrated into the WorkflowSettings constructor.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
docs/project_info/faq.rst (2)
684-687: Enhance explanation with specific examples.
The explanation of consistency checks is comprehensive, but could be more helpful with specific examples of each case (e.g., what constitutes a "semantic level" change).
Consider adding examples like:
This is based on file modification dates (input files may not be newer than output files of the same job), as well as execution metadata like the used software stack (e.g. conda env or container image), the non-file parameters, the set of input files, and the code of the rule.
+For example:
+- If you update a conda environment specification, Snakemake will rerun affected jobs
+- If you modify a rule's parameters, Snakemake will rerun that rule's jobs
+- If you change the set of input files, Snakemake will rerun the affected jobs701-707: Add a note about wildcard support in CLI example.
The CLI usage example is clear, but it would be helpful to mention if wildcards are supported in the rule name or input specification.
Consider adding:
This will mark the file ``inputfile.txt`` as ancient for the rule ``myrule``.
+Note that both rule names and input names support wildcards, e.g., ``snakemake --consider-ancient "align_*=*.fastq"``.📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
- docs/project_info/faq.rst (1 hunks)
- snakemake/profiles.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/profiles.py (1)
Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.
🔇 Additional comments (5)
snakemake/profiles.py (1)
56-56: LGTM! The addition of "consider-ancient" is well-integrated.
The change correctly adds "consider-ancient" to the list of keys that receive one-level dictionary formatting, consistent with similar configuration keys like "set-threads" and "resources". This enables the feature to mark input files as ancient via workflow profiles.
Let's verify the consistent handling of this new configuration key across the codebase:
✅ Verification successful
The "consider-ancient" configuration is properly integrated across the codebase
The verification confirms that the feature is consistently implemented:
- CLI: Properly defined with appropriate argument handling and error messages
- Settings: Type-safe implementation using a mapping type in settings/types.py
- Documentation: Included in FAQ with both CLI and profile configuration examples
- Profile: Correctly added to the special keys list (from original review)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the integration of the "consider-ancient" configuration
# Expected: Should find corresponding handling code in CLI and workflow settings
# Check for CLI argument handling
echo "Checking CLI argument handling..."
rg -l "consider-ancient" "snakemake/cli.py"
# Check for settings/configuration handling
echo "Checking settings handling..."
rg -l "consider_ancient" "snakemake/settings"
# Check for documentation
echo "Checking documentation updates..."
rg -l "consider-ancient" "docs/"
Length of output: 410
Script:
#!/bin/bash
# Let's examine the actual implementation and documentation to ensure consistency
# Check CLI implementation
echo "=== CLI Implementation ==="
rg "consider-ancient" -B 2 -A 2 snakemake/cli.py
# Check settings type definition
echo -e "\n=== Settings Type Definition ==="
rg "consider_ancient" -B 2 -A 2 snakemake/settings/types.py
# Check documentation content
echo -e "\n=== Documentation Content ==="
rg "consider-ancient" -B 2 -A 2 docs/project_info/faq.rst
Length of output: 1175
docs/project_info/faq.rst (4)
679-682: LGTM: Section reference and title are properly formatted.
The RST section reference and title follow the standard documentation format with proper underline length.
691-700: LGTM: Example rule is clear and well-formatted.
The example rule effectively demonstrates the concept with a realistic use case.
708-713: LGTM: Profile configuration example is clear and accurate.
The YAML configuration example correctly demonstrates how to persist the setting in a workflow-specific profile.
677-713: Documentation aligns well with PR objectives.
The new FAQ section effectively documents the feature for marking input files as ancient, which directly addresses the PR's objective of allowing users to override rerun triggers via the API and command line interface.
Let's verify the documentation matches the implementation:
✅ Verification successful
Documentation accurately reflects the implementation
The documentation in the FAQ section correctly describes both the command-line interface and profile configuration for marking input files as ancient. This is verified by:
- CLI implementation in
snakemake/cli.pymatches the documented syntax--consider-ancient myrule=foo - Profile configuration in
snakemake/profiles.pysupports the documented YAML format withconsider-ancientkey - Rule implementation in
snakemake/rules.pyproperly handles the ancient marking via_set_inoutput_item
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if the CLI option is properly implemented
rg --type py "consider.?ancient"
# Check if the profile configuration is properly implemented
rg --type py "consider.?ancient.*config"
Length of output: 1122
Script:
#!/bin/bash
# Let's verify the implementation details more thoroughly
# Check the CLI parser implementation
rg -A 10 "def parse_consider_ancient" snakemake/cli.py
# Check the profile loading implementation
rg -A 5 "consider-ancient" snakemake/profiles.py
# Check how ancient marking is handled in rules
rg -A 5 "_set_inoutput_item.*mark_ancient" snakemake/rules.py
Length of output: 1282
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
docs/project_info/faq.rst (1)
688-726: Consider enhancing the example with multiple input scenarios
While the documentation effectively explains how to mark input files as ancient both via CLI and workflow profiles, it could be more comprehensive by including:
- An example with multiple input files
- A note about the impact on downstream rules
Consider adding this additional example:
rule myrule:
input:
foo="inputfile.txt",
+ bar="another.txt"
output:
"outputfile.txt"
shell:
"somecommand {input.foo} {input.bar} > {output}"And the corresponding CLI example:
$ snakemake --consider-ancient myrule=foo,barAnd profile example:
consider-ancient:
- myrule: foo
+ myrule: [foo, bar]📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
- docs/project_info/faq.rst (1 hunks)
- snakemake/cli.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/cli.py (1)
Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.
📓 Learnings (1)
snakemake/cli.py (1)
Learnt from: johanneskoester
PR: snakemake/snakemake#3171
File: snakemake/cli.py:106-145
Timestamp: 2024-10-29T09:26:26.636Z
Learning: In the `snakemake` codebase, exception chaining (using `raise ... from ...`) is avoided due to backward compatibility reasons. Do not suggest using exception chaining in `snakemake` codebase.
🪛 Ruff
snakemake/cli.py
134-134: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🔇 Additional comments (6)
docs/project_info/faq.rst (3)
677-683: LGTM: Well-structured section header and placement
The new FAQ section is properly placed and follows the existing document structure with appropriate RST formatting.
684-687: LGTM: Clear explanation of Snakemake's consistency checks
The explanation accurately describes all factors that Snakemake considers when determining whether to rerun a rule:
- File modification dates
- Execution metadata
- Software stack
- Non-file parameters
- Input file set
- Rule code
679-680: LGTM: Proper RST reference label
The RST reference label consider_ancient is correctly formatted and placed, allowing cross-referencing from other documentation files.
snakemake/cli.py (3)
776-788: LGTM! Well documented CLI argument.
The implementation follows the established pattern for CLI arguments and provides clear documentation about the purpose and format of the --consider-ancient option.
2049-2049: LGTM! Proper integration with workflow settings.
The consider_ancient setting is correctly integrated into the WorkflowSettings class.
106-145: 🛠️ Refactor suggestion
Consider validating empty items in the comma-separated list.
The implementation is solid, but it could benefit from validating empty or whitespace-only items in the comma-separated list of input items. Currently, items.split(",") could produce empty strings that might cause issues.
Consider this improvement:
def parse_consider_ancient(
args: Optional[List[str]],
) -> Mapping[str, Set[Union[str, int]]]:
"""Parse command line arguments for marking input files as ancient.
Args:
args: List of RULE=INPUTITEMS pairs, where INPUTITEMS is a comma-separated list
of input item names or indices (0-based).
Returns:
A mapping of rules to sets of their ancient input items.
Raises:
ValueError: If the format is invalid or values cannot be parsed.
"""
errmsg = (
"Invalid --consider-ancient definition: entries have to be defined as "
"RULE=INPUTITEMS pairs, with INPUTITEMS being a list of input items of the "
"rule (given as name or index (0-based)), separated by commas."
)
def parse_item(item: str) -> Union[str, int]:
item = item.strip()
if not item:
raise ValueError(f"{errmsg} (Empty input item)")
try:
return int(item)
except ValueError:
if item.isidentifier():
return item
else:
raise ValueError(f"{errmsg} (Unparsable value: {repr(item)})")
consider_ancient = defaultdict(set)
if args is not None:
for entry in args:
rule, items = parse_key_value_arg(entry, errmsg=errmsg, strip_quotes=True)
if not rule.isidentifier():
raise ValueError(f"{errmsg} (Invalid rule name: {repr(rule)})")
- items = items.split(",")
+ items = [item for item in items.split(",") if item.strip()]
+ if not items:
+ raise ValueError(f"{errmsg} (No valid input items provided)")
consider_ancient[rule] = {parse_item(item) for item in items}
return consider_ancientLikely invalid or redundant comment.
🧰 Tools
🪛 Ruff
134-134: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 I have created a release *beep* *boop* --- ## [8.25.0](v8.24.1...v8.25.0) (2024-10-29) ### Features * add first 5 rules to group name (used e.g. when naming cluster/cloud jobs or logfiles) ([#3168](#3168)) ([5657122](5657122)) * allow to mark input files of rules as ancient via the API or command line interface (and thereby also via workflow specific profiles). Putting this into a workflow specific profile (or specifying as argument) allows to overrule rerun triggers caused by file modification dates where the user knows better. ([#3171](#3171)) ([6f3aed3](6f3aed3)) ### Bug Fixes * skip storage object when cloning flags for shadowed IO ([#3174](#3174)) ([d733fed](d733fed)) * use permission safe copying when hidden conda files are already present in a workdir. This avoids problems in case multiple people use the same workdir and workflow. ([#3169](#3169)) ([c98b2e7](c98b2e7)) ### Documentation * add tutorial references and small syntax fix ([#3172](#3172)) ([6bee12a](6bee12a)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>



QC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit
New Features
--consider-ancientcommand-line argument for enhanced control over input item handling.WorkflowSettingsto store mappings for "ancient" items.--consider-ancientoption.Bug Fixes
Enhancements
Ruleclass to support marking items as ancient.WorkflowSettingsfor better data handling capabilities.