fix: force check all required outputs #3341
📝 Walkthrough
This pull request makes a series of modifications across the repository. In core modules, the checkpoint logic has been simplified by removing file existence checks, and the DAG logic has been reformatted for clarity. Conditional execution has been added for various GitHub Actions jobs based on pull request states. New I/O utility functions for parsing inputs and extracting checksums have been introduced, and the CLI help text has been clarified. Additionally, tests have been expanded, documentation has been updated, and asset metadata has been enhanced.
Sequence Diagram(s)
sequenceDiagram
participant WF as Workflow
participant SS as StorageSettings
participant DAG as DAG
WF->>SS: Check keep_storage_local attribute
alt keep_storage_local is False
WF->>DAG: Call cleanup_storage_objects()
else
WF->>WF: Skip cleanup
end
sequenceDiagram
participant PR as Pull Request Event
participant CI as GitHub Actions
participant Job as CI Job
PR->>CI: Trigger event (push/PR)
CI->>Job: Evaluate condition (merged != true || branch != main)
alt Condition met
Job->>CI: Execute job (formatting, testing, docs, etc.)
else
Job->>CI: Skip execution
end
Actionable comments posted: 0
🧹 Nitpick comments (2)
snakemake/checkpoints.py (1)
37-41: Core checkpoint fix: checking all outputs against created_output.
This is the core fix of the PR: it checks that all of the outputs (rather than just any of them) are in `created_output`, instead of checking against `future_output`. This ensures that every required output of the checkpoint rule has been created before proceeding.
The nested `if` statements could be merged into a single condition for better readability.
- if self.checkpoints.created_output is not None:
-     if set(output) <= set(self.checkpoints.created_output):
+ if self.checkpoints.created_output is not None and set(output) <= set(self.checkpoints.created_output):
      return CheckpointJob(self.rule, output)
🧰 Tools
🪛 Ruff (0.8.2)
39-40: Use a single `if` statement instead of nested `if` statements (SIM102)
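The difference between requiring all outputs and accepting any output can be seen in a small standalone sketch. The helper names `all_created` and `any_created` and the file names are hypothetical and not part of Snakemake; only the subset comparison `set(output) <= set(created_output)` is taken from the comment above.

```python
def all_created(output, created_output):
    """Return True only if every requested output has been created."""
    # created_output may be None before anything has been recorded.
    return created_output is not None and set(output) <= set(created_output)


def any_created(output, created_output):
    """Return True if at least one requested output has been created."""
    return created_output is not None and bool(set(output) & set(created_output))


requested = ["first/a", "first/b"]
created = ["first/a"]  # only one of the two outputs exists so far

print(all_created(requested, created))                # False
print(any_created(requested, created))                # True
print(all_created(requested, created + ["first/b"]))  # True
print(all_created(requested, None))                   # False
```

The subset test only succeeds once both requested outputs are present, which is the stricter behaviour the comment describes.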
tests/test_checkpoints_many/Snakefile (1)
56-56: Typo in checkpoint usage pattern.
There's a space between the `**` and `wildcards` in the checkpoint call, which should be removed for consistency with the conventional unpacking style.
- outputs_i = glob.glob(f"{checkpoints.first.get(** wildcards).output}/*/")
+ outputs_i = glob.glob(f"{checkpoints.first.get(**wildcards).output}/*/")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- snakemake/checkpoints.py (4 hunks)
- snakemake/dag.py (0 hunks)
- tests/test_checkpoints_many/Snakefile (1 hunk)
- tests/tests.py (1 hunk)
💤 Files with no reviewable changes (1)
- snakemake/dag.py
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`:
Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the `self` argument of methods.
Do not suggest type annotation of the `cls` argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a `return` statement.
- snakemake/checkpoints.py
- tests/tests.py
⏰ Context from checks skipped due to timeout of 90000ms (40)
- GitHub Check: testing (10, 3.12, dash)
- GitHub Check: testing (10, 3.12, bash)
- GitHub Check: testing (10, 3.11, bash)
- GitHub Check: testing (9, 3.12, dash)
- GitHub Check: testing (9, 3.12, bash)
- GitHub Check: testing (9, 3.11, bash)
- GitHub Check: testing (8, 3.12, dash)
- GitHub Check: testing (8, 3.12, bash)
- GitHub Check: testing (8, 3.11, bash)
- GitHub Check: testing (7, 3.12, dash)
- GitHub Check: testing (7, 3.12, bash)
- GitHub Check: testing (7, 3.11, bash)
- GitHub Check: testing (6, 3.12, dash)
- GitHub Check: testing (6, 3.12, bash)
- GitHub Check: testing (6, 3.11, bash)
- GitHub Check: testing (5, 3.12, dash)
- GitHub Check: testing (5, 3.12, bash)
- GitHub Check: testing (5, 3.11, bash)
- GitHub Check: testing (4, 3.12, dash)
- GitHub Check: testing (4, 3.12, bash)
- GitHub Check: testing (4, 3.11, bash)
- GitHub Check: testing (3, 3.12, dash)
- GitHub Check: testing-windows (10)
- GitHub Check: testing (3, 3.12, bash)
- GitHub Check: testing-windows (9)
- GitHub Check: testing (3, 3.11, bash)
- GitHub Check: testing-windows (8)
- GitHub Check: testing-windows (7)
- GitHub Check: testing (2, 3.12, dash)
- GitHub Check: testing-windows (6)
- GitHub Check: testing (2, 3.12, bash)
- GitHub Check: testing-windows (5)
- GitHub Check: testing (2, 3.11, bash)
- GitHub Check: testing-windows (4)
- GitHub Check: testing (1, 3.12, dash)
- GitHub Check: testing-windows (3)
- GitHub Check: testing (1, 3.12, bash)
- GitHub Check: testing-windows (2)
- GitHub Check: testing (1, 3.11, bash)
- GitHub Check: testing-windows (1)
🔇 Additional comments (5)
tests/tests.py (1)
1353-1355: LGTM! Good test addition.
Adding a test for the more complex checkpoint scenario is good practice and ensures the fix works with multiple checkpoints. This will help verify that the checkpoint output validation works correctly.
snakemake/checkpoints.py (3)
1-8: Type imports look good.
The addition of proper typing imports and the TYPE_CHECKING block for forward references is a good practice to avoid circular imports while providing type hints.
26-26: Adding type hints improves code readability.
The addition of type hints for the constructor parameters is a good practice that enhances code readability and enables better IDE support.
49-49: Type hint for CheckpointJob constructor.
Consistent with the additions to the Checkpoint class, adding type hints to the CheckpointJob constructor improves code quality.
tests/test_checkpoints_many/Snakefile (1)
1-78: Well-structured test Snakefile for multiple checkpoints.
This Snakefile creates a comprehensive test case with multiple checkpoints in sequence, which is excellent for verifying the fix. The workflow tests a variety of checkpoint scenarios:
- Using multiple checkpoints in sequence
- Glob-based dynamic output discovery
- Nested wildcard propagation
- Multiple output files from checkpoints
The `aggregate` function properly validates that checkpoint outputs exist and combines the results from both checkpoints.
d7d5730 to 6ecc84e
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/test_checkpoints_many/Snakefile (1)
74-74: Missing line termination.
Line 74 lacks a newline character at the end of the file. This could cause issues with tools that expect files to end with a newline.
- touch("collect/{sample}/all_done.txt"),
+ touch("collect/{sample}/all_done.txt"),
+
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/tests.py
- snakemake/checkpoints.py
🔇 Additional comments (2)
tests/test_checkpoints_many/Snakefile (2)
50-67: The aggregate function provides thorough checkpoint validation.
This function effectively demonstrates the intended usage pattern for checkpoint management, traversing through outputs from both checkpoints and creating a comprehensive collection of dependent files. The assertion on line 59 is particularly valuable for testing, ensuring that the expected checkpoint outputs exist.
1-74: Well-structured test for validating checkpoint functionality.
This test file effectively validates the PR objective of checking whether all outputs specified by rules are created, rather than relying on `future_output`. The workflow creates a multi-step dependency chain with two checkpoints and demonstrates proper aggregation of outputs.
The test structure enables validation of the checkpoint functionality by:
- Creating initial sample files
- Generating nested directory structures in two checkpoint stages
- Copying and aggregating the results
- Verifying all expected outputs exist
This provides good coverage for testing the changes made in PR #3341 to address issue #3036.
6ecc84e to dfe767a
Actionable comments posted: 0
🧹 Nitpick comments (3)
tests/test_checkpoints_many/Snakefile (3)
3-3: Remove unused import.
The `random` module is imported but never used in this Snakefile.
  import glob
- import random
  from pathlib import Path
15-15: Consider using expand() for output files.
Using a list directly as `output` is unusual in Snakemake. For consistency and clarity, consider using the `expand()` function or a list comprehension here.
-     output:
-         ALL_SAMPLES,
+     output:
+         expand("{sample}", sample=ALL_SAMPLES),
53-55: Remove unnecessary blank line.
There is an empty line between two related operations. For better readability, consider removing it.
  outputs_i = glob.glob(f"{checkpoints.first.get(**wildcards).output}/*/")
-
  outputs_i = [output.split("/")[-2] for output in outputs_i]
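The glob-and-split pattern quoted from the test Snakefile can be tried outside Snakemake. The following is a minimal sketch under stated assumptions: the temporary directory stands in for a checkpoint output directory, the subdirectory names are made up, and the `split("/")` step assumes POSIX path separators. `Path(...).name` is shown as a separator-independent alternative.

```python
import glob
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for a checkpoint's output directory.
    for name in ("0", "1", "2"):
        sub = Path(tmp) / name
        sub.mkdir()
        (sub / "test.txt").touch()
    # A plain file at the top level; the pattern below skips it.
    (Path(tmp) / "notes.txt").touch()

    # A pattern ending in "/" matches directories only, and each match
    # keeps its trailing separator.
    matches = glob.glob(f"{tmp}/*/")

    # With a trailing "/", the directory name is the second-to-last piece.
    by_split = sorted(m.split("/")[-2] for m in matches)
    # pathlib ignores the trailing separator, so .name gives the same result.
    by_path = sorted(Path(m).name for m in matches)

print(by_split)
print(by_path)
```

Both lists contain only the three subdirectory names, so the two extraction styles are interchangeable on POSIX systems.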
🔇 Additional comments (2)
tests/test_checkpoints_many/Snakefile (2)
52-52: Fix spacing in the get method parameter.
There is an extra space between the `**` operator and `wildcards`. Python accepts whitespace after `**`, so this is a style inconsistency rather than a syntax error.
- outputs_i = glob.glob(f"{checkpoints.first.get(** wildcards).output}/*/")
+ outputs_i = glob.glob(f"{checkpoints.first.get(**wildcards).output}/*/")
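Whether a space after `**` affects parsing can be checked directly in plain Python. In the sketch below, `get` is a hypothetical stand-in for a checkpoint's `get` method and simply returns its keyword arguments; it is not Snakemake's implementation.

```python
import ast


def get(**wildcards):
    """Hypothetical stand-in for a checkpoint's get(); returns its keyword arguments."""
    return wildcards


wildcards = {"sample": "a"}

# Both spellings are valid and behave identically.
spaced = get(** wildcards)
tight = get(**wildcards)

# ast.parse raises SyntaxError on invalid source; the spaced form is accepted.
tree = ast.parse("get(** wildcards)")

# The spaced form also works inside an f-string expression.
path = f"{get(** wildcards)['sample']}/*/"

print(spaced == tight)
print(path)
```

The two calls return the same dictionary, so removing the space is a readability choice only.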
56-66: The aggregate function correctly tests the checkpoint mechanism.This implementation aligns well with the PR objective to directly check whether all outputs specified by the rule are created. It properly verifies the existence of outputs from both checkpoints and validates that the workflow is correctly propagating wildcards between checkpoint stages.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 0
🧹 Nitpick comments (6)
tests/test_checkpoints_many/Snakefile (6)
1-3: Unused import detected.
The `random` module is imported but not used anywhere in this Snakefile.
  import glob
- import random
  from pathlib import Path
15-15: Consider using a more explicit output specification.
Using the variable directly as output is a bit unusual and might be confusing. Consider using a more explicit pattern that shows these are files.
- ALL_SAMPLES,
+ expand("{sample}", sample=ALL_SAMPLES),
28-29: Good use of Path for directory creation, but consider adding error handling.
The code correctly uses `Path().mkdir()` with `parents=True` to create directories. Consider adding error handling for potential I/O errors.
- Path(f"{output[0]}/{i}").mkdir(parents=True, exist_ok=True)
- Path(f"{output[0]}/{i}/test.txt").touch()
+ try:
+     Path(f"{output[0]}/{i}").mkdir(parents=True, exist_ok=True)
+     Path(f"{output[0]}/{i}/test.txt").touch()
+ except IOError as e:
+     raise WorkflowError(f"Error creating checkpoint output: {e}")
58-59: Good checkpoint verification but consider enhancing error message.
The assertion verifies that the checkpoint output exists, which aligns with the PR objective of checking all required outputs. Consider adding a more descriptive error message.
- assert Path(s2out).exists()
+ assert Path(s2out).exists(), f"Checkpoint output directory '{s2out}' does not exist"
63-65: Consider simplifying the expand pattern.
The `expand` call could be simplified since you're only expanding a single wildcard.
- split_files.extend(
-     expand(f"copy/{{sample}}/{i}/{j}/test2.txt", sample=wildcards.sample)
- )
+ split_files.append(f"copy/{wildcards.sample}/{i}/{j}/test2.txt")
73-74: Missing newline at end of file.
Add a newline at the end of the file to follow standard file format conventions.
  touch("collect/{sample}/all_done.txt"),
+
📒 Files selected for processing (1)
tests/test_checkpoints_many/Snakefile(1 hunks)
🔇 Additional comments (2)
tests/test_checkpoints_many/Snakefile (2)
52-52: Fix spacing in the get method parameter.
There is an extra space between the `**` operator and `wildcards`. Python accepts whitespace after `**`, so this is a style inconsistency rather than a syntax error.
- outputs_i = glob.glob(f"{checkpoints.first.get(** wildcards).output}/*/")
+ outputs_i = glob.glob(f"{checkpoints.first.get(**wildcards).output}/*/")
50-66: Excellent implementation of checkpoint dependency tracking.
The `aggregate` function supports the PR's objective by directly checking that required outputs from checkpoints exist. It uses `checkpoint.get().output` to access the outputs and verifies their existence with `Path().exists()`, which exercises the core change described in the PR objectives.
Caution
Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 1
🧹 Nitpick comments (9)
.github/workflows/main.yml (2)
52-55: Empty 'exclude' Section
The `exclude: []` block is currently empty. Static analysis (actionlint) signals that an empty exclude section may be unintentional or unnecessary. If no exclusions are needed, consider removing this block to simplify the configuration and avoid potential warnings. Otherwise, populate it with the intended exclusions.
🧰 Tools
🪛 actionlint (1.7.4)
52-52: "exclude" section should not be empty (syntax-check)
🪛 YAMLlint (1.35.1)
[error] 53-53: trailing spaces (trailing-spaces)
211-218: Addition of 'testing-done' Job and Final Newline Reminder
The new `testing-done` job serves as a final confirmation step by echoing "All tests passed" when the preceding jobs (`testing` and `testing-windows`) complete successfully. This addition enhances clarity in the CI pipeline.
Note: YAMLlint reported a missing newline at the end of the file (line 218). Please add a newline character at the end of the file to avoid the linting warning.
🧰 Tools
🪛 YAMLlint (1.35.1)
[error] 218-218: no new line character at the end of file (new-line-at-end-of-file)
snakemake/workflow.py (1)
1813-1814: Minor style improvement suggestion.
The inequality comparison with `False` could be replaced with a truth check. The two are not equivalent for values such as `None` or an empty string, which are falsy but not equal to `False`, so the replacement is only safe if `container_img` never holds such values.
- if not ruleinfo.template_engine and ruleinfo.container_img != False:
+ if not ruleinfo.template_engine and ruleinfo.container_img:
🧰 Tools
🪛 Ruff (0.8.2)
1813-1813: Avoid inequality comparisons to `False`; use `if ruleinfo.container_img:` for truth checks. Replace with `ruleinfo.container_img` (E712)
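Before swapping `!= False` for a truth check, it helps to compare how the two behave on a few values. The values below are illustrative only; which of them the real attribute can hold is an assumption that this page does not settle.

```python
# Illustrative values only; which of these a real attribute can hold is an assumption.
candidates = [None, False, "", "docker://example/image"]

results = {}
for value in candidates:
    results[repr(value)] = {
        "not_equal_false": value != False,  # noqa: E712, the comparison under discussion
        "truthy": bool(value),
        "is_not_false": value is not False,
    }

for name, row in results.items():
    print(name, row)
```

For `None` and the empty string, `!= False` is true while the truth check is false. If only the literal `False` is meant to disable the feature, `is not False` is an identity check that avoids the `!=` comparison while still treating `None` and the empty string as enabled.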
snakemake/ioutils/input.py (1)
18-28: Consider handling missing files more gracefully.
The current implementation raises a KeyError if the specified file is not found in the checksum file, which does not give users a clear error message. Consider looking the file name up explicitly and raising a WorkflowError that names the missing file.
+ target = fix_file_name(kwargs.get("file"))
- return (
+ checksums = (
      pd.read_csv(
          infile,
          sep=" ",
          header=None,
          engine="python",
          converters={1: fix_file_name},
      )
      .set_index(1)
-     .loc[fix_file_name(kwargs.get("file"))]
-     .item()
  )
+ if target not in checksums.index:
+     raise WorkflowError(f"File {kwargs.get('file')} not found in checksum file {infile}")
+ return checksums.loc[target].item()
docs/snakefiles/rules.rst (2)
550-567: Review of `parse_input` Documentation
The new documentation for the `parse_input` function is clear and concise. It explains the purpose, signature, and usage with an example. Consider explicitly mentioning that the third parameter, kwargs, is expected to be a dictionary (or that it supports arbitrary keyword arguments via `**kwargs`) for even greater clarity.
573-589: Review of `extract_checksum` Documentation
This section clearly outlines the purpose and signature of the `extract_checksum` function along with an example usage. It might be beneficial to briefly note (or link to) which checksum algorithm is applied, if that information is available, to help users understand potential performance or compatibility considerations.
docs/snakefiles/reporting.rst (3)
104-110: Defining File Labels Section
This section introduces the concept of assigning human‐friendly labels to output files. The explanation and accompanying example (the modified rule “b”) help illustrate how one can abstract away technical file details in the report. Consider adding a brief note on the benefits of this approach for nontechnical report consumers.
141-143: Dynamic Determination of Categories and Labels
The explanation of how to dynamically determine `category`, `subcategory`, and `labels` via functions is comprehensive and informative. The header is descriptive, although it could be shortened.
151-153: Subsection: From Captions
The “From captions” subsection is a concise introduction to the hyperlink mechanism for captions. It could be even more helpful with a small inline example or a direct reference to further detailed documentation.
🛑 Comments failed to post (1)
snakemake/ioutils/input.py (1)
12-30: ⚠️ Potential issue
The `extract_checksum` function needs some improvements.
There are several issues with this implementation:
- Missing import for `WorkflowError`
- Using lambda for function definition instead of a regular def
- Missing exception chaining for better error tracking
- No docstring explaining the function's purpose and parameters
+from snakemake.exceptions import WorkflowError

 def extract_checksum(infile, **kwargs):
+    """Extract checksum from a file for a specified file.
+
+    Args:
+        infile: Path to the file containing checksums
+        **kwargs: Additional arguments, including 'file' to specify target filename
+
+    Returns:
+        The checksum for the specified file
+    """
     try:
         import pandas as pd

-        fix_file_name = lambda x: x.removeprefix("./")
+        def fix_file_name(x):
+            return x.removeprefix("./")
+
         return (
             pd.read_csv(
                 infile,
                 sep=" ",
                 header=None,
                 engine="python",
                 converters={1: fix_file_name},
             )
             .set_index(1)
             .loc[fix_file_name(kwargs.get("file"))]
             .item()
         )
     except ImportError as err:
-        raise WorkflowError("Pandas is required to extract checksum from file.")
+        raise WorkflowError("Pandas is required to extract checksum from file.") from err
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
    from snakemake.exceptions import WorkflowError

    def extract_checksum(infile, **kwargs):
        """Extract checksum from a file for a specified file.

        Args:
            infile: Path to the file containing checksums
            **kwargs: Additional arguments, including 'file' to specify target filename

        Returns:
            The checksum for the specified file
        """
        try:
            import pandas as pd

            def fix_file_name(x):
                return x.removeprefix("./")

            return (
                pd.read_csv(
                    infile,
                    sep=" ",
                    header=None,
                    engine="python",
                    converters={1: fix_file_name},
                )
                .set_index(1)
                .loc[fix_file_name(kwargs.get("file"))]
                .item()
            )
        except ImportError as err:
            raise WorkflowError("Pandas is required to extract checksum from file.") from err
🧰 Tools
🪛 Ruff (0.8.2)
16-16: Do not assign a `lambda` expression, use a `def`. Rewrite `fix_file_name` as a `def` (E731)
30-30: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
30-30: Undefined name `WorkflowError` (F821)
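The points raised about `extract_checksum` (a named helper instead of an assigned lambda, a clear error for a missing file, and exception chaining) can be sketched without pandas. This is a standard-library stand-in, not the PR's implementation: `lookup_checksum` and `ChecksumLookupError` are hypothetical names, and the checksum file is assumed to hold one whitespace-separated `checksum name` pair per line.

```python
import tempfile
from pathlib import Path


class ChecksumLookupError(Exception):
    """Hypothetical stand-in for Snakemake's WorkflowError."""


def fix_file_name(name):
    # A regular def instead of an assigned lambda (Ruff E731).
    return name.removeprefix("./")


def lookup_checksum(infile, file):
    """Return the checksum recorded for `file` in a 'checksum name' text file."""
    table = {}
    with open(infile) as fh:
        for line in fh:
            if not line.strip():
                continue
            checksum, name = line.split(maxsplit=1)
            table[fix_file_name(name.strip())] = checksum
    try:
        return table[fix_file_name(file)]
    except KeyError as err:
        # Chain the original error so the traceback keeps the root cause (Ruff B904).
        raise ChecksumLookupError(
            f"File {file} not found in checksum file {infile}"
        ) from err


with tempfile.TemporaryDirectory() as tmp:
    sums = Path(tmp) / "checksums.txt"
    sums.write_text("abc123 ./data.txt\ndef456 other.txt\n")

    found = lookup_checksum(sums, "data.txt")
    try:
        lookup_checksum(sums, "missing.txt")
    except ChecksumLookupError as err:
        message = str(err)
        cause = err.__cause__

print(found)
print(message)
```

A missing entry then surfaces as an error that names both the file and the checksum file, while the underlying `KeyError` stays available as the exception's cause.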
c6a773d to 6426310
🧹 Nitpick comments (10)
snakemake/report/html_reporter/data/packages.py (1)
41-185: Avoid manual duplication of asset references.
All these newly introduced packages match the corresponding entries withinsnakemake/assets/__init__.py. To reduce the risk of mismatches, you could generate these references programmatically fromAssets.spec, ensuring both files stay in sync as you update versions and checksums.snakemake/ioutils/input.py (1)
1-9: Implementation looks good, but consider adding type hints and docstringThe
parse_inputfunction implementation is correct and provides a clean way to parse input files with a custom parser. Consider adding docstring and type hints to improve code readability and IDE support.-def parse_input(infile, parser, **kwargs): +def parse_input(infile, parser=None, **kwargs): + """Parse an input file with an optional parser function. + + Args: + infile: Path to the input file + parser: Optional function to parse the file content + **kwargs: Additional arguments passed to the parser + + Returns: + A function that reads the file and applies the parser if provided + """ def inner(wildcards, input, output): with open(infile, "r") as fh: if parser is None: return fh.read().strip() else: return parser(fh, **kwargs) return innertests/test_validate/Snakefile (1)
41-50: Partial validation of Polars LazyFrame.Validating only the first 1000 records is acceptable for large datasets, but consider whether additional rows need checking.
snakemake/utils.py (3)
118-149: Pandas DataFrame row-by-row validation is solid.
One minor improvement is to raise exceptions using Python's preferred exception chaining.
Apply this diff in the except clause to follow best practices:
-raise WorkflowError(
-    f"Error validating row {i} of data frame.", e
-)
+raise WorkflowError(
+    f"Error validating row {i} of data frame."
+) from e
🧰 Tools
🪛 Ruff (0.8.2)
132-134: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
150-218: Polars DataFrame validation approach is well-structured.
A similar improvement for exception chaining is recommended in both exception blocks:
-raise WorkflowError(
-    f"Error validating row {i} of data frame.", e
-)
+raise WorkflowError(
+    f"Error validating row {i} of data frame."
+) from e
And similarly for the lazy validation block:
-raise WorkflowError(
-    f"Error validating row {i} of data frame.", e
-)
+raise WorkflowError(
+    f"Error validating row {i} of data frame."
+) from e
🧰 Tools
🪛 Ruff (0.8.2)
168-170: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
206-208: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
219-226: Config dictionary validation is straightforward.
Again, consider adopting the "raise ... from e" format for clarity:
-raise WorkflowError("Error validating config file.", e)
+raise WorkflowError("Error validating config file.") from e
🧰 Tools
🪛 Ruff (0.8.2)
224-224: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
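What `raise ... from e` changes can be observed on the exception object itself. The sketch below uses a plain `Exception` subclass named `ValidationFailure` as a hypothetical stand-in; it does not model how Snakemake's `WorkflowError` formats its arguments, and `validate_row` is an invented validator.

```python
class ValidationFailure(Exception):
    """Hypothetical stand-in; it does not model WorkflowError's message formatting."""


def validate_row(row):
    # Hypothetical validator: requires a "sample" key.
    if "sample" not in row:
        raise KeyError("sample")


def check_unchained(row, i):
    try:
        validate_row(row)
    except KeyError as e:
        # The original error is passed as an extra argument only.
        raise ValidationFailure(f"Error validating row {i} of data frame.", e)


def check_chained(row, i):
    try:
        validate_row(row)
    except KeyError as e:
        # "from e" records the original error as the explicit cause.
        raise ValidationFailure(f"Error validating row {i} of data frame.") from e


def capture(func):
    """Call func on an invalid row and return the exception it raises."""
    try:
        func({}, 0)
    except ValidationFailure as err:
        return err


unchained = capture(check_unchained)
chained = capture(check_chained)

print(unchained.__cause__)          # None
print(repr(unchained.__context__))  # KeyError('sample')
print(repr(chained.__cause__))      # KeyError('sample')
```

Without `from e`, Python still records the original error as implicit context, and the traceback reads "During handling of the above exception, another exception occurred". With `from e`, the original error is marked as the direct cause, which is the distinction B904 asks for.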
.github/workflows/main.yml (4)
51-54: Empty `exclude` List and Commented-Out Shell Option
An empty `exclude: []` is now present, and the option for the `dash` shell has been commented out. Note that the static analysis tool flagged an issue with an empty `exclude`; if this key isn't needed, consider removing it or adding an inline comment to clarify its purpose.
🧰 Tools
🪛 actionlint (1.7.4)
52-52: "exclude" section should not be empty (syntax-check)
🪛 YAMLlint (1.35.1)
[error] 53-53: trailing spaces (trailing-spaces)
211-218: New "testing-done" Job Addition
The new `testing-done` job, which echoes "All tests passed," follows the same conditional logic and depends on both the testing and testing-windows jobs. This addition appears useful for signaling overall test success, but ensure that its placement in the dependency chain doesn't accidentally mask upstream failures.
🧰 Tools
🪛 YAMLlint (1.35.1)
[error] 218-218: no new line character at the end of file (new-line-at-end-of-file)
53-53: Trailing Spaces Detected
YAMLlint reported trailing spaces on this line. Please remove any trailing whitespace to conform to the YAML formatting standards.
🧰 Tools
🪛 YAMLlint (1.35.1)
[error] 53-53: trailing spaces (trailing-spaces)
218-218: Missing Newline at End of File
A newline character is missing at the end of this file. Adding a newline will help avoid potential issues with some parsers or tools that expect a final newline.
🧰 Tools
🪛 YAMLlint (1.35.1)
[error] 218-218: no new line character at the end of file (new-line-at-end-of-file)
🛑 Comments failed to post (1)
snakemake/ioutils/input.py (1)
12-30: ⚠️ Potential issue
Fix the undefined WorkflowError and lambda expression issues
This function has several issues that need to be addressed:
- `WorkflowError` is undefined - need to import it
- Using lambda assignment is discouraged (E731)
- Exception handling should use `from err` syntax
- Consider adding docstring and type hints
+from snakemake.exceptions import WorkflowError

 def extract_checksum(infile, **kwargs):
+    """Extract checksum from a file for a given filename.
+
+    Args:
+        infile: Path to the checksum file
+        **kwargs: Must contain 'file' key with the filename to look up
+
+    Returns:
+        The checksum for the specified file
+
+    Raises:
+        WorkflowError: If pandas is not available
+    """
     try:
         import pandas as pd

-        fix_file_name = lambda x: x.removeprefix("./")
+        def fix_file_name(x):
+            return x.removeprefix("./")
+
         return (
             pd.read_csv(
                 infile,
                 sep=" ",
                 header=None,
                 engine="python",
                 converters={1: fix_file_name},
             )
             .set_index(1)
             .loc[fix_file_name(kwargs.get("file"))]
             .item()
         )
-    except ImportError:
-        raise WorkflowError("Pandas is required to extract checksum from file.")
+    except ImportError as err:
+        raise WorkflowError("Pandas is required to extract checksum from file.") from err
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
    from snakemake.exceptions import WorkflowError

    def extract_checksum(infile, **kwargs):
        """Extract checksum from a file for a given filename.

        Args:
            infile: Path to the checksum file
            **kwargs: Must contain 'file' key with the filename to look up

        Returns:
            The checksum for the specified file

        Raises:
            WorkflowError: If pandas is not available
        """
        try:
            import pandas as pd

            def fix_file_name(x):
                return x.removeprefix("./")

            return (
                pd.read_csv(
                    infile,
                    sep=" ",
                    header=None,
                    engine="python",
                    converters={1: fix_file_name},
                )
                .set_index(1)
                .loc[fix_file_name(kwargs.get("file"))]
                .item()
            )
        except ImportError as err:
            raise WorkflowError("Pandas is required to extract checksum from file.") from err
🧰 Tools
🪛 Ruff (0.8.2)
16-16: Do not assign a `lambda` expression, use a `def`. Rewrite `fix_file_name` as a `def` (E731)
30-30: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
30-30: Undefined name `WorkflowError` (F821)
Actionable comments posted: 0
🔭 Outside diff range comments (1)
snakemake/ioutils/input.py (1)
12-31: ⚠️ Potential issue

Missing exception handling and imports in `extract_checksum` function

The function contains several issues that need to be addressed:

- `WorkflowError` is undefined; the import is missing
- A lambda is assigned to a variable instead of using a proper function definition
- The exception should be raised with `from err` syntax to preserve the traceback

Apply these fixes:

```diff
+from snakemake.exceptions import WorkflowError

 def extract_checksum(infile, **kwargs):
     try:
         import pandas as pd

-        fix_file_name = lambda x: x.removeprefix("./")
+        def fix_file_name(x):
+            return x.removeprefix("./")
         return (
             pd.read_csv(
                 infile,
                 sep=" ",
                 header=None,
                 engine="python",
                 converters={1: fix_file_name},
             )
             .set_index(1)
             .loc[fix_file_name(kwargs.get("file"))]
             .item()
         )
-    except ImportError:
-        raise WorkflowError("Pandas is required to extract checksum from file.")
+    except ImportError as err:
+        raise WorkflowError("Pandas is required to extract checksum from file.") from err
```

🧰 Tools
🪛 Ruff (0.8.2)
16-16: Do not assign a `lambda` expression, use a `def`. Rewrite `fix_file_name` as a `def` (E731)
30-30: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
30-30: Undefined name `WorkflowError` (F821)
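The lookup that `extract_checksum` performs can be approximated with the standard library alone. The following is a minimal sketch, not Snakemake's implementation: the helper name `lookup_checksum` is hypothetical, and it assumes an `md5sum`-style file with one `<checksum> <name>` pair per line, where names may carry a leading `./`.

```python
import os
import tempfile


def lookup_checksum(checksum_file, filename):
    """Return the checksum recorded for `filename` (hypothetical helper)."""
    wanted = filename.removeprefix("./")
    with open(checksum_file) as fh:
        for line in fh:
            parts = line.split(maxsplit=1)
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            checksum, name = parts
            # Normalize the recorded name the same way as the requested one.
            if name.strip().removeprefix("./") == wanted:
                return checksum
    raise KeyError(f"No checksum recorded for {filename!r}")


# Usage: write a small checksum file and look one entry up.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "samples.md5")
    with open(path, "w") as fh:
        fh.write("d41d8cd98f00b204e9800998ecf8427e  ./a.txt\n")
        fh.write("0cc175b9c0f1b6a831c399e269772661  ./b.txt\n")
    found = lookup_checksum(path, "b.txt")
```

Raising `KeyError` for a missing entry is a design choice of this sketch; the reviewed function would surface whatever pandas raises for a missing index label.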
♻️ Duplicate comments (1)
tests/test_checkpoints_many/Snakefile (1)
56-56: ⚠️ Potential issue

Fix spacing in the get method parameter.

There is an extra space between the `**` operator and `wildcards`. Python accepts `** wildcards`, so this does not raise a syntax error when the workflow runs, but the spacing is unconventional and worth normalizing.

```diff
- outputs_i = glob.glob(f"{checkpoints.first.get(** wildcards).output}/*/")
+ outputs_i = glob.glob(f"{checkpoints.first.get(**wildcards).output}/*/")
```
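Whether the extra space changes behavior can be checked directly. The sketch below uses a hypothetical `get` function standing in for any callable that takes keyword arguments; it compares the two spellings both at runtime and at the AST level.

```python
import ast


def get(**wildcards):
    """Hypothetical stand-in for a function that takes keyword arguments."""
    return wildcards


wildcards = {"sample": "a"}

# Both spellings unpack the mapping into keyword arguments.
spaced = get(** wildcards)
compact = get(**wildcards)

# ast.dump omits position attributes by default, so whitespace is ignored.
same_ast = ast.dump(ast.parse("get(** wildcards)")) == ast.dump(
    ast.parse("get(**wildcards)")
)
```

Both calls return the same mapping and parse to the same tree, so the suggested change is a matter of style.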
🧹 Nitpick comments (9)
.github/workflows/main.yml (2)
50-54: Review the Matrix Configuration Exclude Section and Related Comments.

Within the testing job’s matrix configuration, an empty exclusion list is specified:

```yaml
exclude: []
```

Additionally, there is a commented-out dash shell configuration. If no exclusions are required, consider removing the `exclude` key entirely for clarity. Also, static analysis identified trailing spaces on line 53; please remove these to comply with YAML formatting guidelines.

🧰 Tools
🪛 actionlint (1.7.4)
52-52: "exclude" section should not be empty (syntax-check)
🪛 YAMLlint (1.35.1)
[error] 53-53: trailing spaces (trailing-spaces)

211-218: New Testing-Done Job: Final Output Notification and Formatting Note.

The newly added testing-done job is a clear and concise way to notify that all tests have passed. Its conditional execution is consistent with the other jobs. However, static analysis indicates that there is no newline at the end of the file (line 218). Please add a newline for YAML compliance.

🧰 Tools
🪛 YAMLlint (1.35.1)
[error] 218-218: no new line character at the end of file (new-line-at-end-of-file)
snakemake/workflow.py (1)
1813-1816: Consider using a more Pythonic condition check

The code uses an inequality comparison with `False`, which is not the recommended Python style.

```diff
-if not ruleinfo.template_engine and ruleinfo.container_img != False:
+if not ruleinfo.template_engine and ruleinfo.container_img is not False:
```

Alternative approach:

```diff
-if not ruleinfo.template_engine and ruleinfo.container_img != False:
+if not ruleinfo.template_engine and ruleinfo.container_img:
```

Note: The second approach treats every falsy value (such as `None` or an empty string) the same as `False`, so it is only equivalent if `container_img` never needs to distinguish `False` from other falsy values.

🧰 Tools
🪛 Ruff (0.8.2)
1813-1813: Avoid inequality comparisons to `False`; use `if ruleinfo.container_img:` for truth checks. Replace with `ruleinfo.container_img` (E712)
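The three candidate conditions are not interchangeable. As a small sketch (the sample values, including the image string, are illustrative assumptions and not taken from Snakemake), the table below evaluates each form against a few inputs:

```python
def checks(value):
    """Evaluate the three candidate conditions for one value."""
    return {
        "!= False": value != False,  # noqa: E712 (the original form)
        "is not False": value is not False,
        "truthy": bool(value),
    }


# Illustrative inputs; "docker://img" is a made-up container image string.
results = {repr(v): checks(v) for v in (False, None, "", 0, "docker://img")}

for value, outcome in results.items():
    print(value, outcome)
```

`None` and `""` pass the first two checks but fail the truthiness check, and `0` fails `!= False` (because `0 == False`) while passing `is not False`. Which form is right depends on what values the attribute can actually hold.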
tests/test_checkpoints_many/Snakefile (2)
54-71: Consider adding error handling to the aggregate function.

While the function includes an assertion to verify path existence, it could benefit from more robust error handling, especially for the glob operations which may return empty lists.

```diff
 def aggregate(wildcards):
     outputs_i = glob.glob(f"{checkpoints.first.get(**wildcards).output}/*/")
+    if not outputs_i:
+        raise ValueError(f"No outputs found for first checkpoint with wildcards {wildcards}")
     outputs_i = [output.split("/")[-2] for output in outputs_i]
     split_files = []
     for i in outputs_i:
         s2out = checkpoints.second.get(**wildcards, i=i).output[0]
         assert Path(s2out).exists()
         output_j = glob.glob(f"{s2out}/*/")
+        if not output_j:
+            raise ValueError(f"No outputs found for second checkpoint with wildcards {wildcards}, i={i}")
         outputs_j = [output.split("/")[-2] for output in output_j]
         for j in outputs_j:
             split_files.extend(
                 expand(f"copy/{{sample}}/{i}/{j}/test2.txt", sample=wildcards.sample)
             )
+    if not split_files:
+        raise ValueError(f"No files to aggregate for wildcards {wildcards}")
     return split_files
```

43-51: Consider using a more efficient copying mechanism.

For copying files, using Python's shutil might be more cross-platform compatible than relying on the shell cp command.

```diff
 rule copy:
     input:
         "second/{sample}/{i}/{j}/test2.txt",
     output:
         "copy/{sample}/{i}/{j}/test2.txt",
-    shell:
-        """
-        cp -f {input} {output}
-        """
+    run:
+        import shutil
+        Path(output[0]).parent.mkdir(parents=True, exist_ok=True)
+        shutil.copy2(input[0], output[0])
```

snakemake/utils.py (4)
132-134: Use exception chaining for clarity.

When re-raising an exception in an `except` block, you can use `from e` to clearly chain the original error:

```diff
- raise WorkflowError(f"Error validating row {i} of data frame.", e)
+ raise WorkflowError(f"Error validating row {i} of data frame.") from e
```

🧰 Tools
🪛 Ruff (0.8.2)
132-134: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)

168-170: Use exception chaining here as well.

Similar to the previous suggestion, applying explicit exception chaining helps clarify error origins:

```diff
- raise WorkflowError(f"Error validating row {i} of data frame.", e)
+ raise WorkflowError(f"Error validating row {i} of data frame.") from e
```

🧰 Tools
🪛 Ruff (0.8.2)
168-170: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)

206-208: Apply exception chaining for completeness.

Following the same pattern ensures consistent error propagation and traceability:

```diff
- raise WorkflowError(f"Error validating row {i} of data frame.", e)
+ raise WorkflowError(f"Error validating row {i} of data frame.") from e
```

🧰 Tools
🪛 Ruff (0.8.2)
206-208: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)

224-224: Consider explicit exception chaining for config file error.

Likewise, you can chain the original exception to retain context:

```diff
- raise WorkflowError("Error validating config file.", e)
+ raise WorkflowError("Error validating config file.") from e
```

🧰 Tools
🪛 Ruff (0.8.2)
224-224: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
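The effect of `raise ... from e` can be seen on the raised exception itself. This is a minimal sketch under stated assumptions: `WorkflowError` here is a local stand-in class rather than the one from `snakemake.exceptions`, and `validate_row` is a hypothetical function that fails on non-numeric input.

```python
class WorkflowError(Exception):
    """Local stand-in for snakemake.exceptions.WorkflowError (assumption)."""


def validate_row(i, row):
    """Hypothetical validator: the 'count' field must parse as an integer."""
    try:
        int(row["count"])
    except ValueError as e:
        # Chaining stores the original error on __cause__ instead of in args.
        raise WorkflowError(f"Error validating row {i} of data frame.") from e


try:
    validate_row(0, {"count": "abc"})
except WorkflowError as err:
    caught = err
```

With chaining, the message stays a single string and the traceback reports the `ValueError` as the direct cause.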
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (7)
- `CHANGELOG.md` is excluded by `!CHANGELOG.md`
- `pyproject.toml` is excluded by `!pyproject.toml`
- `tests/test_script_xsh/expected-results/test.out` is excluded by `!**/*.out`
- `CHANGELOG.md` is excluded by `!CHANGELOG.md`
- `pyproject.toml` is excluded by `!pyproject.toml`
- `tests/test_validate/samples.tsv` is excluded by `!**/*.tsv`
- `tests/test_validate/samples.tsv` is excluded by `!**/*.tsv`
📒 Files selected for processing (72)
- snakemake/checkpoints.py (4 hunks)
- snakemake/dag.py (0 hunks)
- tests/test_checkpoints_many/Snakefile (1 hunks)
- tests/tests.py (1 hunks)
- snakemake/checkpoints.py (2 hunks)
- tests/test_checkpoints_many/Snakefile (1 hunks)
- tests/test_checkpoints_many/Snakefile (1 hunks)
- .github/workflows/codespell.yml (1 hunks)
- .github/workflows/docs.yml (2 hunks)
- .github/workflows/main.yml (6 hunks)
- docs/snakefiles/deployment.rst (2 hunks)
- docs/snakefiles/reporting.rst (5 hunks)
- docs/snakefiles/rules.rst (3 hunks)
- setup.py (1 hunks)
- snakemake/cli.py (2 hunks)
- snakemake/ioutils/__init__.py (2 hunks)
- snakemake/ioutils/input.py (1 hunks)
- snakemake/script/__init__.py (3 hunks)
- snakemake/workflow.py (2 hunks)
- test-environment.yml (2 hunks)
- tests/test_conda_python_3_7_script/Snakefile (1 hunks)
- tests/test_conda_python_3_7_script/test_script_python_3_7.py (1 hunks)
- tests/test_conda_run/Snakefile (1 hunks)
- tests/test_conda_run/expected-results/test.txt (1 hunks)
- tests/test_conda_run/test_python_env.yaml (1 hunks)
- tests/test_conda_run/test_script_run.py (1 hunks)
- tests/test_ioutils/Snakefile (4 hunks)
- tests/test_ioutils/expected-results/c/1.txt (1 hunks)
- tests/test_ioutils/expected-results/results/switch~someswitch.column~sample.txt (1 hunks)
- tests/test_ioutils/samples.md5 (1 hunks)
- tests/test_script_xsh/Snakefile (1 hunks)
- tests/test_script_xsh/envs/xonsh.yaml (1 hunks)
- tests/test_script_xsh/scripts/test.xsh (1 hunks)
- tests/tests_using_conda.py (1 hunks)
- snakemake/checkpoints.py (1 hunks)
- snakemake/dag.py (0 hunks)
- tests/tests.py (1 hunks)
- .github/workflows/codespell.yml (1 hunks)
- .github/workflows/docs.yml (2 hunks)
- .github/workflows/main.yml (6 hunks)
- docs/project_info/codebase.rst (1 hunks)
- docs/project_info/contributing.rst (2 hunks)
- docs/snakefiles/configuration.rst (1 hunks)
- docs/snakefiles/rules.rst (2 hunks)
- setup.py (1 hunks)
- snakemake/assets/__init__.py (4 hunks)
- snakemake/cli.py (1 hunks)
- snakemake/dag.py (1 hunks)
- snakemake/ioutils/__init__.py (2 hunks)
- snakemake/ioutils/input.py (1 hunks)
- snakemake/report/html_reporter/data/packages.py (1 hunks)
- snakemake/utils.py (1 hunks)
- snakemake/workflow.py (1 hunks)
- test-environment.yml (2 hunks)
- tests/test_ioutils/Snakefile (4 hunks)
- tests/test_ioutils/expected-results/c/1.txt (1 hunks)
- tests/test_ioutils/expected-results/results/switch~someswitch.column~sample.txt (1 hunks)
- tests/test_ioutils/samples.md5 (1 hunks)
- tests/test_validate/Snakefile (1 hunks)
- tests/test_validate/samples.schema.yaml (1 hunks)
- docs/project_info/codebase.rst (1 hunks)
- docs/project_info/contributing.rst (2 hunks)
- docs/snakefiles/configuration.rst (1 hunks)
- snakemake/assets/__init__.py (4 hunks)
- snakemake/checkpoints.py (0 hunks)
- snakemake/dag.py (2 hunks)
- snakemake/report/html_reporter/data/packages.py (1 hunks)
- snakemake/utils.py (1 hunks)
- tests/test_issue1092/Snakefile (1 hunks)
- tests/test_validate/Snakefile (1 hunks)
- tests/test_validate/samples.schema.yaml (1 hunks)
- tests/test_issue1092/Snakefile (1 hunks)
✅ Files skipped from review due to trivial changes (7)
- tests/test_conda_run/test_script_run.py
- tests/test_script_xsh/envs/xonsh.yaml
- tests/test_script_xsh/scripts/test.xsh
- tests/test_conda_run/expected-results/test.txt
- tests/test_conda_run/test_python_env.yaml
- tests/test_conda_python_3_7_script/test_script_python_3_7.py
- tests/test_checkpoints_many/Snakefile
🚧 Files skipped from review as they are similar to previous changes (35)
- docs/project_info/codebase.rst
- tests/test_ioutils/expected-results/c/1.txt
- .github/workflows/codespell.yml
- docs/project_info/codebase.rst
- .github/workflows/codespell.yml
- tests/test_ioutils/expected-results/results/switch~someswitch.column~sample.txt
- .github/workflows/docs.yml
- tests/test_ioutils/expected-results/results/switch~someswitch.column~sample.txt
- setup.py
- docs/snakefiles/configuration.rst
- tests/test_ioutils/expected-results/c/1.txt
- snakemake/checkpoints.py
- snakemake/cli.py
- tests/tests.py
- docs/snakefiles/configuration.rst
- .github/workflows/docs.yml
- setup.py
- snakemake/workflow.py
- snakemake/cli.py
- test-environment.yml
- snakemake/dag.py
- snakemake/ioutils/__init__.py
- snakemake/ioutils/__init__.py
- tests/tests.py
- tests/test_ioutils/samples.md5
- docs/project_info/contributing.rst
- tests/test_validate/samples.schema.yaml
- snakemake/dag.py
- snakemake/dag.py
- snakemake/checkpoints.py
- .github/workflows/main.yml
- test-environment.yml
- tests/test_validate/samples.schema.yaml
- tests/test_ioutils/samples.md5
- snakemake/ioutils/input.py
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`:
- Do not try to improve formatting.
- Do not suggest type annotations for functions that are defined inside of functions or methods.
- Do not suggest type annotation of the `self` argument of methods.
- Do not suggest type annotation of the `cls` argument of classmethods.
- Do not suggest return type annotation if a function or method does not contain a `return` statement.

Files:
- tests/tests_using_conda.py
- snakemake/dag.py
- snakemake/utils.py
- snakemake/workflow.py
- snakemake/checkpoints.py
- snakemake/script/__init__.py
- snakemake/assets/__init__.py
- snakemake/report/html_reporter/data/packages.py
- snakemake/ioutils/input.py
🪛 GitHub Actions: CI
snakemake/dag.py
[error] 272-272: Black formatting check failed. 1 file would be reformatted. Please run 'black' to fix code style issues in this file.
🪛 Ruff (0.8.2)
snakemake/utils.py
132-134: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
168-170: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
206-208: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
224-224: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
snakemake/workflow.py
1813-1813: Avoid inequality comparisons to False; use if ruleinfo.container_img: for truth checks
Replace with ruleinfo.container_img
(E712)
snakemake/ioutils/input.py
16-16: Do not assign a lambda expression, use a def
Rewrite fix_file_name as a def
(E731)
30-30: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
30-30: Undefined name WorkflowError
(F821)
🪛 actionlint (1.7.4)
.github/workflows/main.yml
52-52: "exclude" section should not be empty
(syntax-check)
🪛 YAMLlint (1.35.1)
.github/workflows/main.yml
[error] 53-53: trailing spaces
(trailing-spaces)
[error] 218-218: no new line character at the end of file
(new-line-at-end-of-file)
🔇 Additional comments (57)
.github/workflows/main.yml (4)
15-16: Ensure Consistent Conditional Execution for CI Jobs.

The condition

```yaml
if: github.event.pull_request.merged != true || github.ref != 'refs/heads/main'
```

in the formatting job skips execution only when the pull request has been merged and the ref is the main branch; in every other case the job runs. Please verify that this logic fully aligns with your intended CI trigger scenarios.

41-42: Revalidate the Conditional Check in the Testing Job.

The testing job now uses the same conditional as the formatting job. Confirm that skipping the job only for merged PRs on the main branch is what your PR objectives intend.

156-157: Validate Conditional Execution in the Build-Container Image Job.

The build-container-image job now also includes:

```yaml
if: github.event.pull_request.merged != true || github.ref != 'refs/heads/main'
```

This is consistent with the approach used in other jobs. Please confirm that this condition is correctly applied for your CI needs.

166-167: Confirm the Conditional Check in the Testing-Windows Job.

The testing-windows job leverages the same conditional check. Ensure that this condition maintains the desired behavior across all platforms (in this case, Windows).
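Because the condition joins two negations with `||`, it is easy to misread. The truth table can be enumerated with a short sketch; this models only the boolean logic, not how GitHub Actions evaluates expressions, and the `refs/heads/feature` ref is an illustrative assumption.

```python
from itertools import product

MAIN = "refs/heads/main"
FEATURE = "refs/heads/feature"  # illustrative non-main ref


def job_runs(merged, ref):
    """Mirror of: merged != true || ref != 'refs/heads/main'."""
    return merged is not True or ref != MAIN


# Enumerate every combination of merge state and ref.
table = {
    (merged, ref): job_runs(merged, ref)
    for merged, ref in product((True, False), (MAIN, FEATURE))
}
skipped = [case for case, runs in table.items() if not runs]
```

By De Morgan's law the job is skipped only when both parts are false, that is, for a merged pull request on `refs/heads/main`.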
tests/test_conda_python_3_7_script/Snakefile (1)
7-7: Script file path update looks good

The script file reference has been updated from "test_script.py" to "test_script_python_3_7.py", which appears to be providing a more specific Python 3.7 version of the test script. This change matches the testing approach that's being used in the other test files being added/modified in this PR.
tests/test_script_xsh/Snakefile (1)
1-12: New test workflow for xonsh script execution looks good

This new Snakefile creates a proper test workflow with clearly defined input/output dependencies. The file contains:
- An "all" rule that establishes the final expected output
- A "test_xonsh" rule that uses a conda environment and executes a xonsh script
The structure follows Snakemake best practices by clearly specifying the output and using the script directive to run the xonsh script within the conda environment defined in envs/xonsh.yaml.
tests/test_conda_run/Snakefile (1)
1-13: New test for conda script execution with explanatory comments

This new rule definition provides a good test case for running Python scripts with conda environments. The informative comments (lines 9-12) are particularly helpful for users, clarifying that this approach is only for testing purposes and that the script directive would normally be preferred in real workflows.
The pattern used here is valuable for testing the direct shell execution of Python scripts within conda environments, which appears to be a different execution path from the script directive approach.
tests/tests_using_conda.py (2)
309-315: New test for xonsh script execution looks good

This test function properly uses the appropriate decorators to:
- Skip on Windows (since xonsh scripts may have platform-specific behavior)
- Use the conda deployment method
The test definition is consistent with other similar test functions in the file, making it easy to understand and maintain.
318-320: New test for conda run functionality looks good

This test function correctly implements a test for the conda run functionality. Unlike the xonsh test, this one doesn't have the `@skip_on_windows` decorator, suggesting that this functionality should work across platforms.
The implementation is clean and follows the same pattern as other similar test functions in the file.
snakemake/script/__init__.py (4)
1600-1603: Well-structured Xonsh script execution implementation.

This implementation extends support for Xonsh scripts by inheriting from `PythonScript` and properly overriding the `execute_script` method to use the xonsh command.

1667-1668: LGTM - Good extension recognition implementation.

The detection logic for .xsh files follows the same pattern as other supported languages in the codebase.

1732-1732: LGTM - Proper registration of the XonshScript executor.

The `XonshScript` class is correctly added to the executor dictionary, allowing it to handle files with the .xsh extension.

1736-1736: LGTM - Comprehensive error message update.

The error message has been properly updated to include Xonsh (.xsh) in the list of supported script types.
snakemake/checkpoints.py (1)
33-35: Simple and effective approach for checking checkpoint outputs

This change implements the fix described in PR #3341, replacing the use of `future_output` with a direct check of whether all outputs specified by the rule are created, making the logic more straightforward.

snakemake/dag.py (3)

270-273: Method name change accurately reflects expanded functionality

The renamed method `finished_and_not_needrun_checkpoint_jobs` now explicitly describes what it does: yield jobs that are checkpoint jobs and are either finished or not marked for running. This change is more precise than the previous name and matches the implementation.

🧰 Tools
🪛 GitHub Actions: CI
[error] 272-272: Black formatting check failed. 1 file would be reformatted. Please run 'black' to fix code style issues in this file.

276-278: Update aligns checkpoint outputs with method changes

The `update_checkpoint_outputs` method now correctly uses the renamed and modified `finished_and_not_needrun_checkpoint_jobs` method, ensuring consistency in how checkpoint outputs are tracked throughout the DAG.
409-428: ❓ Verification inconclusive

Improved storage output handling with topological sorting

The `store_storage_outputs` method has been significantly restructured to use a two-level loop that iterates over levels of jobs obtained from `toposorted` rather than a single loop over `needrun_jobs`. This approach allows for more organized processing of jobs based on their dependency levels, ensuring that storage outputs are handled in the correct order.

Let's verify that all outputs are still properly stored with the new implementation:

🏁 Script executed:

```shell
#!/bin/bash
# Check for any references to the old future_output attribute in the codebase
rg "future_output" --type py
```

Length of output: 28

Revised Storage Handling Implementation Verified – Please Confirm

The updated code in `snakemake/dag.py` (lines 409–428) now processes jobs in dependency levels using `toposorted` iterations. Our search for the legacy `future_output` attribute yielded no results, suggesting that the refactoring has successfully removed outdated references. Please manually verify that all outputs (including benchmarks and job logs) continue to be stored correctly under the new asynchronous TaskGroup setup.

- Code now iterates over dependency levels from `toposorted(…)` rather than using a single flat loop.
- No occurrences of `future_output` were found across the codebase.
- Confirm proper storage of all outputs asynchronously under the new implementation.
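The idea of iterating over dependency levels can be illustrated with the standard library's `graphlib`. This is a sketch of the general technique only, not Snakemake's `toposorted` implementation; the helper name `toposorted_levels` and the job names are hypothetical.

```python
from graphlib import TopologicalSorter


def toposorted_levels(dependencies):
    """Yield sets of nodes; each level depends only on earlier levels."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    while sorter.is_active():
        level = sorter.get_ready()  # all nodes whose dependencies are done
        yield set(level)
        sorter.done(*level)


# Hypothetical job graph: each key depends on the jobs in its value set.
deps = {
    "report": {"align_a", "align_b"},
    "align_a": {"fetch"},
    "align_b": {"fetch"},
}
levels = list(toposorted_levels(deps))

# Two-level loop: outer over levels, inner over the jobs within a level.
order = [job for level in levels for job in sorted(level)]
```

Jobs within one level have no dependencies on each other, which is what makes it safe to handle them together before moving to the next level.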
docs/project_info/contributing.rst (2)
188-214: Documentation improvements align with Sphinx standards

The section heading structure has been updated to follow the Sphinx recommendations, which provides a more consistent and standardized approach to documentation. This will improve readability and maintainability of the docs.

228-230: Environment name updated for documentation build

The Conda environment name has been changed from `snakemake` to `snakemake_docs`, which better indicates the purpose of this environment and separates it from the general development environment.

docs/snakefiles/reporting.rst (7)

7-10: Enhanced report description improves user understanding

The documentation now provides a clearer explanation of Snakemake's reporting capabilities, highlighting that reports contain runtime statistics, provenance information, and workflow topology. It also introduces the two main report types: self-contained HTML files for smaller reports and ZIP archives for more complex ones.

13-17: New section improves organization of report documentation

Adding a dedicated section for "Including results in a report" with proper heading makes the documentation more structured and easier to navigate for users looking for specific information about report customization.

257-264: Added context about report data collection

This new section clarifies where the report metadata comes from (the `.snakemake` directory), which helps users better understand the reporting mechanism and troubleshoot potential issues.

265-285: New section on HTML reports provides clear usage instructions

The dedicated section for self-contained HTML reports clearly explains how to generate them, including customizing the report filename. The warning about suitability only for smaller reports helps users choose the appropriate report format for their needs.

286-303: ZIP archive report documentation addresses scalability

This section provides valuable guidance on using ZIP archive reports for more complex workflows, clearly explaining the benefits and usage. The information about the main entry point (`report.html`) is particularly helpful for users sharing reports with collaborators.

304-317: New section on partial reports enhances flexibility

Documentation on partial reports is a valuable addition that explains how to generate reports for specific targets, which is useful for workflows that haven't completed or when exploring intermediate results.

318-331: Custom layout section enhances report customization options

The new section on custom layouts provides clear instructions on how to apply custom stylesheets to reports, including references to example files. This will help institutions create branded reports that match their visual identity.

tests/test_issue1092/Snakefile (1)
tests/test_issue1092/Snakefile (1)
32-35: Code refactored to use direct shell directive

The rule implementation has been refactored from using a `run` directive with a `shell()` function call to a direct `shell:` directive. This change simplifies the implementation while maintaining the same behavior.

snakemake/ioutils/input.py (1)

1-9: Simple but effective utility for parsing input files

The `parse_input` function provides a clean way to parse input files with an optional parser function. It returns a closure that handles file opening and applies the parser when needed.

tests/test_ioutils/Snakefile (4)

11-13: Added test for new `extract_checksum` function

Good test to verify the new `extract_checksum` functionality works correctly by checking the expected checksum value for a file.

22-31: Updated rule to use the new checksum functionality

The rule has been enhanced to use the new `extract_checksum` functionality, demonstrating how to parse a checksum file and use it in a rule.

58-58: Added trailing comma for consistent syntax

Adding the trailing comma is a good practice for maintainability as it makes future additions easier without causing diff changes to the line.

85-85: Reformatted shell command

No functional change, just reformatting.
snakemake/workflow.py (5)
1780-1788: Well-designed helper function to centralize software deployment checks

Good refactoring to extract common logic for checking if software deployment methods are allowed with template engines into a reusable helper function.

1794-1796: Replaced inline check with helper function call

Good refactoring to use the new helper function, improving code maintainability.

1800-1802: Replaced inline check with helper function call

Good refactoring to use the new helper function, improving code maintainability.

1808-1810: Replaced inline check with helper function call

Good refactoring to use the new helper function, improving code maintainability.

1293-1296: Added conditional execution for storage cleanup

This is the core fix of the PR that addresses the issue mentioned in the PR objective. Now it only cleans up storage objects if `keep_storage_local` is False. This aligns with the goal of "discarding the use of `future_output` in favor of directly checking whether all outputs specified by the rule are created."

tests/test_validate/Snakefile (6)
4-4: Good addition of Polars support.

Adding the Polars import expands the testing framework to cover more dataframe libraries beyond just Pandas.

11-20: Well-structured validation tests for dictionary data.

This new section properly tests dictionary-based validation with proper assertions for both complete data and data with null values filtered out. The code follows good practices by isolating test cases and asserting specific values.

22-27: Good addition of Pandas DataFrame validation without index.

The code provides additional test coverage for validating Pandas DataFrames without explicit indexing, which is a common use case.

29-39: Well-implemented Polars DataFrame validation.

The Polars DataFrame test case is a good addition covering schema definition, null value handling, and proper assertions. This enhances the test coverage for the validation functionality across different dataframe libraries.

41-49: Comprehensive Polars LazyFrame test case.

This section adds testing for Polars LazyFrame, which is important for ensuring the validation works with lazy evaluation patterns. The `collect()` call properly demonstrates how to handle lazy operations in validation contexts.

51-57: Good addition of index-based assertions.

The additional assertions for the indexed DataFrame test case complete the test coverage for the various DataFrame formats and access patterns.
docs/snakefiles/deployment.rst (2)
288-293: Clear documentation update for conda with run directive.

This documentation update properly clarifies that the `run` directive can use conda environments, with an important explanation of its special behavior. The note explains that the conda environment only affects shell calls within the run script, which is critical information for users.

464-467: Consistent documentation for apptainer with run directive.

The update mirrors the conda documentation to consistently explain how the `run` directive interacts with containers. This maintains a coherent documentation style and helps users understand the parallel behaviors between conda and apptainer integrations.

docs/snakefiles/rules.rst (2)
550-571: Documentation for new `parse_input` function looks good.

The added documentation clearly explains the purpose and usage of the `parse_input` function, which allows parsing an input file to extract values. The signature, parameters, and example usage are all well documented.
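The closure pattern behind such a helper can be sketched as follows. This is an illustration under stated assumptions rather than the code in `snakemake/ioutils/input.py`: the name `make_input_parser`, the `scaled_int` parser, and the `scale` keyword are hypothetical. The inner function accepts `wildcards`, `input`, and `output` because Snakemake calls `params` functions with those arguments.

```python
import os
import tempfile


def make_input_parser(infile, parser=None, **kwargs):
    """Return a function that reads `infile` only when called (hypothetical)."""

    def inner(wildcards=None, input=None, output=None):
        # The file is opened lazily, at the time the value is requested.
        with open(infile) as fh:
            if parser is None:
                return fh.read().strip()
            return parser(fh, **kwargs)

    return inner


def scaled_int(fh, scale=1):
    """Hypothetical parser: read an integer and multiply it by `scale`."""
    return int(fh.read()) * scale


# Usage: the closures are created first and evaluated later.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "value.txt")
    with open(path, "w") as fh:
        fh.write("42\n")
    raw = make_input_parser(path)()
    scaled = make_input_parser(path, parser=scaled_int, scale=2)()
```

Deferring the read until the closure is called matters in a workflow, since the file may not exist yet when the rule is defined.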
573-592: Documentation for new `extract_checksum` function is well structured.

The `extract_checksum` function is properly documented with clear explanation of its purpose, parameters, and usage example. The code sample showing how to apply it in a rule's params section is particularly helpful.

tests/test_checkpoints_many/Snakefile (4)
21-30: Checkpoint implementation looks good.

The `first` checkpoint correctly implements directory output handling and creates a predictable structure with numbered subdirectories. This establishes a good foundation for testing the checkpoint functionality.

32-41: Good use of nested checkpoint dependency.

The `second` checkpoint properly builds upon the first checkpoint, creating an additional level of directories. This helps test how Snakemake handles deep checkpoint dependencies.

8-11: The rule all target is properly defined.

The `all` rule correctly specifies the final target that should trigger the execution of the entire workflow, using expand to handle all samples.

13-19: Good initialization rule for creating sample files.

The `before` rule properly initializes the workflow by creating empty files for each sample, which is a clean way to set up prerequisites for subsequent rules.

snakemake/utils.py (4)
111-117: Looks good for `_validate_record`. The logic to distinguish between using the default validator and the standard validator is clear and concise, with no detected issues.
118-149: Pandas DataFrame validation logic appears solid. Your approach of excluding null values, validating each record, and conditionally updating the DataFrame with default values is well-organized. The repeated creation of a record list, followed by reconstructing the DataFrame, cleanly handles updated columns.
🧰 Tools
🪛 Ruff (0.8.2)
132-134: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
150-218: Polars DataFrame validation logic is similarly robust. Great job mirroring the pandas strategy for null handling, record-based validation, partial checks for LazyFrame, and optional application of default values. This keeps the logic consistent across different DataFrame types.
🧰 Tools
🪛 Ruff (0.8.2)
168-170: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
206-208: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
219-226: Dictionary validation approach looks good. The fallback logic for dictionary validation is straightforward.
🧰 Tools
🪛 Ruff (0.8.2)
224-224: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
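For readers unfamiliar with the B904 hint repeated above, the following is a minimal sketch of what Ruff is asking for. It is not the code from `snakemake/utils.py`; `WorkflowError` here is a hypothetical stand-in class, and the two helper functions are invented for illustration only.

```python
class WorkflowError(Exception):
    """Hypothetical stand-in for a project-specific error type."""


def validate_record(record, required_keys):
    """Return the required fields of a record, or raise WorkflowError."""
    try:
        return {key: record[key] for key in required_keys}
    except KeyError as err:
        # `from err` stores the KeyError as __cause__, so the traceback
        # reports it as the direct cause instead of as a second error
        # that happened while handling the first.
        raise WorkflowError(f"missing required field: {err}") from err


def validate_record_quiet(record, required_keys):
    """Same check, but hide the underlying KeyError from the traceback."""
    try:
        return {key: record[key] for key in required_keys}
    except KeyError as err:
        # `from None` suppresses the implicit exception context.
        raise WorkflowError(f"missing required field: {err}") from None
```

Whether `from err` or `from None` is preferable depends on whether the original exception is useful to the person reading the traceback; for validation errors the original cause is usually worth keeping.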
snakemake/assets/__init__.py (1)
96-530: Extensive asset references are well-handled. Each asset’s pinned version and SHA256 ensures reproducibility and explicit licensing coverage. No issues detected with the approach of referencing external resources.
snakemake/report/html_reporter/data/packages.py (1)
41-186: Package listings are methodical and consistent. All newly added packages properly reference their corresponding license files, preserving clear licensing provenance. This looks great.
🤖 I have created a release *beep* *boop*

---

## [9.0.0](v8.30.0...v9.0.0) (2025-03-14)

### ⚠ BREAKING CHANGES

* Logging refactor & add LoggerPluginInterface ([#3107](#3107))

### Features

* [#3412](#3412) - keep shadow folder of failed job if --keep-incomplete flag is set. ([#3430](#3430)) ([22978c3](22978c3))
* add flag --report-after-run to automatically generate the report after a successfull workflow run ([#3428](#3428)) ([b0a7f03](b0a7f03))
* add flatten function to IO utils ([#3424](#3424)) ([67fa392](67fa392))
* add helper functions to parse input files ([#2918](#2918)) ([63e45a7](63e45a7))
* Add option to print redacted file names ([#3089](#3089)) ([ba4d264](ba4d264))
* add support for validation of polars dataframe and lazyframe ([#3262](#3262)) ([c7473a6](c7473a6))
* added support for rendering dag with mermaid js ([#3409](#3409)) ([7bf8381](7bf8381))
* adding --replace-workflow-config to fully replace workflow configs (from config: directive) with --configfile, instead of merging them ([#3381](#3381)) ([47504a0](47504a0))
* Dynamic module name ([#3401](#3401)) ([024dc32](024dc32))
* Enable saving and reloading IOCache object ([#3386](#3386)) ([c935953](c935953))
* files added in rule params with workflow.source_path will be available in used containers ([#3385](#3385)) ([a6e45bf](a6e45bf))
* Fix keep_local in storage directive and more freedom over remote retrieval behaviour ([#3410](#3410)) ([67b4739](67b4739))
* inherit parameters of use rule and extend/replace individual items them when using 'with' directive ([#3365](#3365)) ([93e4b92](93e4b92))
* Logging refactor & add LoggerPluginInterface ([#3107](#3107)) ([86f1d6e](86f1d6e))
* Maximal file size for checksums ([#3368](#3368)) ([b039f8a](b039f8a))
* Modernize package configuration using Pixi ([#3369](#3369)) ([77992d8](77992d8))
* multiext support for named input/output ([#3372](#3372)) ([05e1378](05e1378))
* optionally auto-group jobs via temp files in case of remote execution ([#3378](#3378)) ([cc9bba2](cc9bba2))

### Bug Fixes

* `--delete-all-output` ignores `--dry-run` ([#3265](#3265)) ([23fef82](23fef82))
* 3342 faster touch runs and warning messages for non-existing files ([#3398](#3398)) ([cd9c3c3](cd9c3c3))
* add default value to max-jobs-per-timespan ([#3043](#3043)) ([2959abe](2959abe))
* checkpoints inside modules are overwritten ([#3359](#3359)) ([fba3ac7](fba3ac7))
* Convert Path to IOFile ([#3405](#3405)) ([c58684c](c58684c))
* Do not perform storage object cleanup with --keep-storage-local-copies set ([#3358](#3358)) ([9a6d14b](9a6d14b))
* edgecases of source deployment in case of remote execution ([#3396](#3396)) ([5da13be](5da13be))
* enhance error message formatting for strict DAG-building mode ([#3376](#3376)) ([a1c39ee](a1c39ee))
* fix bug in checkpoint handling that led to exceptions in case checkpoint output was missing upon rerun ([#3423](#3423)) ([8cf4a2f](8cf4a2f))
* force check all required outputs ([#3341](#3341)) ([495a4e7](495a4e7))
* group job formatting ([#3442](#3442)) ([f0b10a3](f0b10a3))
* in remote jobs, upload storage in topological order such that modification dates are preserved (e.g. in case of group jobs) ([#3377](#3377)) ([eace08f](eace08f))
* only skip eval when resource depends on input ([#3374](#3374)) ([4574c92](4574c92))
* Prevent execution of conda in apptainer when not explicitly requested in software deployment method ([#3388](#3388)) ([c43c5c0](c43c5c0))
* print filenames with quotes around them in RuleException ([#3269](#3269)) ([6baeda5](6baeda5))
* Re-evaluation of free resources ([#3399](#3399)) ([6371293](6371293))
* ReadTheDocs layout issue due to src directory change ([#3419](#3419)) ([695b127](695b127))
* robustly escaping quotes in generated bash scripts (v2) ([#3297](#3297)) ([#3389](#3389)) ([58720bd](58720bd))
* Show apptainer image URL in snakemake report ([#3407](#3407)) ([45f0450](45f0450))
* Update ReadTheDocs configuration for documentation build to use Pixi ([#3433](#3433)) ([3f227a6](3f227a6))

### Documentation

* Add pixi setup instructions to general use tutorial ([#3382](#3382)) ([115e81b](115e81b))
* fix contribution section heading levels, fix docs testing setup order ([#3360](#3360)) ([051dc53](051dc53))
* fix link to github.com/snakemake/poetry-snakemake-plugin ([#3436](#3436)) ([ec6d97c](ec6d97c))
* fix quoting ([#3394](#3394)) ([b40f599](b40f599))
* fix rerun-triggers default ([#3403](#3403)) ([4430e23](4430e23))
* fix typo 'safe' -> 'save' ([#3384](#3384)) ([7755861](7755861))
* mention code formatting in the contribution section ([#3431](#3431)) ([e8682b7](e8682b7))
* remove duplicated 'functions'. ([#3356](#3356)) ([7c595db](7c595db))
* update broken links documentation ([#3437](#3437)) ([e3d0d88](e3d0d88))
* Updating contributing guidelines with new pixi dev setup ([#3415](#3415)) ([8e95a12](8e95a12))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: snakemake-bot <snakemake-bot-admin@googlegroups.com>
This may fix snakemake#3036

This fix replaces `future_output` with `created_output` entirely and directly checks whether all outputs the rule requires have been created.

I am not sure whether `future_output` is used elsewhere. Does it need to be added back?

### QC

* [x] The PR contains test case <`tests/tests.py::test_checkpoints_many`> for the changes.
* [x] the change does neither modify the language nor the behavior or functionalities of Snakemake.

## Summary by CodeRabbit

- **New Features**
  - Enhanced workflow execution with refined job ordering, checkpoint validation, file parsing, checksum extraction, and updated CLI help text.
- **Documentation**
  - Updated examples and guidelines for input parsing and configuration validations, with typographical corrections for improved clarity.
- **Tests**
  - Introduced new test cases for checkpoint and data format validation, and updated expected outputs and schema definitions.
- **Chores**
  - Revised CI workflows with conditional execution and updated environment dependencies to streamline the build process.

---------

Co-authored-by: Johannes Köster <johannes.koester@uni-due.de>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Johannes Köster <johannes.koester@tu-dortmund.de>
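The idea in the description, checking directly that every output a rule requires has been created, can be illustrated with a small standalone sketch. This is not Snakemake's implementation: `missing_outputs` and its parameters are hypothetical names chosen for illustration, and the sketch only considers plain files on the local filesystem.

```python
from pathlib import Path


def missing_outputs(required_outputs, base_dir="."):
    """Return the required outputs that do not exist under base_dir.

    Hypothetical helper: the order of `required_outputs` is preserved,
    and an empty list means every required output has been created.
    """
    base = Path(base_dir)
    return [out for out in required_outputs if not (base / out).exists()]
```

A caller could treat a non-empty result as a reason to keep or rerun the producing job, instead of relying on a separately tracked set of outputs that are expected to appear later.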



This may fix #3036

This fix replaces `future_output` with `created_output` entirely and directly checks whether all outputs the rule requires have been created.

I am not sure whether `future_output` is used elsewhere. Does it need to be added back?

QC
The PR contains test case <`tests/tests.py::test_checkpoints_many`> for the changes.

Summary by CodeRabbit
New Features
`xonsh` scripts in the workflow.
Documentation
Tests
Chores