feat: Maximal file size for checksums by JZentgraf · Pull Request #3368 · snakemake/snakemake

JZentgraf · 2025-03-11T08:33:35Z

Add a new parameter --max-checksum-file-size to adjust the maximal file size for which a checksum is computed when the DAG is build.
This is related to #2949

QC

The PR contains a test case for the changes or the changes are already covered by an existing test case.
The documentation (docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

Summary by CodeRabbit

New Features
- Introduced a command-line option to specify a maximum file size for checksum processing, defaulting to 1,000,000 bytes.
- Added a new configuration parameter for managing the maximum allowable size of checksum files.
Refactor
- Updated file validation and checksum routines to incorporate the new file size limit, enhancing processing efficiency.
- Made method signatures more explicit by removing default values for threshold parameters in checksum-related methods.

coderabbitai · 2025-03-11T08:33:47Z

📝 Walkthrough

Walkthrough

The changes integrate a new configuration parameter (max_checksum_file_size) into the Snakemake workflow. A new helper function and CLI argument enable users to specify a maximum file size (in bytes) for checksum computations. This value is propagated into the API call, the DAG initialization, asynchronous checksum methods, and persistence routines. Additionally, the default threshold values in several IO methods have been removed, requiring callers to explicitly provide the threshold.

Changes

File(s)	Change Summary
`snakemake/cli.py`	Added new function `parse_size_in_bytes`, introduced `--max-checksum-file-size` CLI argument, and passed `max_checksum_file_size` in the API call.
`snakemake/dag.py`, `snakemake/persistence.py`	Added `max_checksum_file_size` attribute (from DAG settings) to the DAG and Persistence classes; updated checksum evaluation calls to include this new parameter in asynchronous methods.
`snakemake/io.py`	Removed default values for the `threshold` parameter from methods (`is_checksum_eligible`, `checksum`, and `is_same_checksum`), now requiring explicit threshold values for checksum operations.
`snakemake/settings/types.py`	Added new attribute `max_checksum_file_size: int = 1000000` to the `DAGSettings` class.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant API
    participant Workflow
    participant DAG
    participant Persistence
    participant IO

    User->>CLI: Provide --max-checksum-file-size value
    CLI->>CLI: Call parse_size_in_bytes to convert value
    CLI->>API: Pass max_checksum_file_size in API call
    API->>Workflow: Forward configuration parameter
    Workflow->>DAG: Initialize DAG with max_checksum_file_size from DAGSettings
    DAG->>IO: Invoke checksum methods with explicit max_checksum_file_size threshold
    DAG->>Persistence: Use max_checksum_file_size in file evaluation
    Persistence->>IO: Call IO checksum functions with provided threshold

Possibly related PRs

perf: compare checksums of input files <= 1MB (before (10KB) #3267: Addresses performance improvements in checksum comparisons based on file size, closely related to introducing the max_checksum_file_size parameter.

Suggested reviewers

johanneskoester

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1526067 and 1696232.

📒 Files selected for processing (1)

snakemake/cli.py (3 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

snakemake/cli.py

⏰ Context from checks skipped due to timeout of 90000ms (30)

GitHub Check: testing (10, 3.12, bash)
GitHub Check: testing (10, 3.11, bash)
GitHub Check: testing (9, 3.12, bash)
GitHub Check: testing (9, 3.11, bash)
GitHub Check: testing (8, 3.12, bash)
GitHub Check: testing (8, 3.11, bash)
GitHub Check: testing (7, 3.12, bash)
GitHub Check: testing (7, 3.11, bash)
GitHub Check: testing (6, 3.12, bash)
GitHub Check: testing (6, 3.11, bash)
GitHub Check: testing (5, 3.12, bash)
GitHub Check: testing (5, 3.11, bash)
GitHub Check: testing-windows (10)
GitHub Check: testing (4, 3.12, bash)
GitHub Check: testing-windows (9)
GitHub Check: testing (4, 3.11, bash)
GitHub Check: testing-windows (8)
GitHub Check: testing (3, 3.12, bash)
GitHub Check: testing-windows (7)
GitHub Check: testing (3, 3.11, bash)
GitHub Check: testing-windows (6)
GitHub Check: testing (2, 3.12, bash)
GitHub Check: testing-windows (5)
GitHub Check: testing (2, 3.11, bash)
GitHub Check: testing-windows (4)
GitHub Check: testing (1, 3.12, bash)
GitHub Check: testing-windows (3)
GitHub Check: testing (1, 3.11, bash)
GitHub Check: testing-windows (2)
GitHub Check: testing-windows (1)

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 0

🔭 Outside diff range comments (1)

snakemake/persistence.py (1)

1-787: ⚠️ Potential issue

Note: Black formatting issue detected.

The CI pipeline reports a Black formatting issue. While we don't need to fix formatting according to the coding guidelines, you should be aware of this issue.

🧰 Tools

🪛 Ruff (0.8.2)

272-272: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

279-281: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

538-538: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

546-546: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

551-551: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

571-571: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

609-609: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

618-618: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

663-663: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

673-673: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)

🪛 GitHub Actions: CI

[error] 1-1: Black formatting check failed. 2 files would be reformatted, 140 files would be left unchanged.

🧹 Nitpick comments (5)

snakemake/workflow.py (1)
1812-1814: Follow Python's style guide for boolean comparisons

Replace direct comparison to False with a truthiness check to follow Python's style guide.
-if not ruleinfo.template_engine and ruleinfo.container_img != False:
+if not ruleinfo.template_engine and ruleinfo.container_img:
    # skip rules with template_engine directive or empty image
    rule.container_img = self.global_container_img
🧰 Tools

🪛 Ruff (0.8.2)

1812-1812: Avoid inequality comparisons to False; use if ruleinfo.container_img: for truth checks

Replace with ruleinfo.container_img

(E712)
tests/test_conda_run/test_script_run.py (1)
1-4: Consider adding error handling to this script.

The script currently has no error handling for potential issues like NumPy import failure or file operation errors. While this is a simple test script, adding basic error handling would make it more robust.
import numpy

-with open("test.txt", "w") as f:
-    f.write(str(numpy.log2(8)))
+try:
+    with open("test.txt", "w") as f:
+        f.write(str(numpy.log2(8)))
+except Exception as e:
+    print(f"Error: {e}")
+    exit(1)
docs/snakefiles/rules.rst (1)

1592-1611: Added Xonsh Script Integration Section

This newly added section introduces documentation on using Xonsh scripts (with the “.xsh” extension) within rules. The code block and accompanying link (https://xon.sh) clearly inform users that Xonsh, being a superset of Python, can be used similarly to a Python script while leveraging additional shell primitives.

Consider a minor wording adjustment: instead of “script:” maybe clarify that rules point to an Xonsh script file. Additionally, check that the formatting (indentation and spacing) is consistent with the surrounding snippets.

docs/snakefiles/deployment.rst (2)

288-294: Clarified Conda Environment Behavior with the run Directive

The updated note now clearly explains that conda environments are available for use with the shell, script, notebook, wrapper, and notably the run directive. It emphasizes that because a run block executes in the same process as Snakemake, the conda directive only affects shell function calls within that block. This clarification is very helpful for users who might otherwise expect full isolation even in run blocks.

A minor suggestion is to standardize phrasing by replacing “from the within run script” with “within the run block.”

464-468: Clarified Apptainer Integration with the run Directive

Similarly, the note regarding apptainer integration now specifies that when using the run directive, containerization only impacts shell function calls executed from inside the run block. This helps set the correct expectations for users leveraging both containerization and the run directive.

A slight wording improvement—using “within the run block” rather than “from within the run script”—could enhance overall clarity and consistency with the conda note.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d1c369b and 9476d7b.

⛔ Files ignored due to path filters (1)

tests/test_script_xsh/expected-results/test.out is excluded by !**/*.out

📒 Files selected for processing (23)

snakemake/cli.py (2 hunks)
snakemake/dag.py (4 hunks)
snakemake/io.py (2 hunks)
snakemake/settings/types.py (1 hunks)
snakemake/io.py (2 hunks)
snakemake/persistence.py (2 hunks)
snakemake/cli.py (1 hunks)
docs/snakefiles/deployment.rst (2 hunks)
docs/snakefiles/reporting.rst (5 hunks)
docs/snakefiles/rules.rst (1 hunks)
snakemake/cli.py (1 hunks)
snakemake/script/__init__.py (3 hunks)
snakemake/workflow.py (1 hunks)
tests/test_conda_python_3_7_script/Snakefile (1 hunks)
tests/test_conda_python_3_7_script/test_script_python_3_7.py (1 hunks)
tests/test_conda_run/Snakefile (1 hunks)
tests/test_conda_run/expected-results/test.txt (1 hunks)
tests/test_conda_run/test_python_env.yaml (1 hunks)
tests/test_conda_run/test_script_run.py (1 hunks)
tests/test_script_xsh/Snakefile (1 hunks)
tests/test_script_xsh/envs/xonsh.yaml (1 hunks)
tests/test_script_xsh/scripts/test.xsh (1 hunks)
tests/tests_using_conda.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

tests/test_conda_run/test_script_run.py
tests/test_conda_python_3_7_script/test_script_python_3_7.py
tests/tests_using_conda.py
snakemake/settings/types.py
snakemake/persistence.py
snakemake/cli.py
snakemake/workflow.py
snakemake/io.py
snakemake/dag.py
snakemake/script/__init__.py

🪛 GitHub Actions: CI

snakemake/persistence.py

[error] 1-1: Black formatting check failed. 2 files would be reformatted, 140 files would be left unchanged.

snakemake/dag.py

[error] 1-1: Black formatting check failed. 2 files would be reformatted, 140 files would be left unchanged.

🪛 Ruff (0.8.2)

snakemake/workflow.py

1812-1812: Avoid inequality comparisons to False; use if ruleinfo.container_img: for truth checks

Replace with ruleinfo.container_img

(E712)

🔇 Additional comments (36)

snakemake/workflow.py (4)

1779-1787: Good addition of centralized validation function

The introduction of the check_may_use_software_deployment function centralizes the validation logic for software deployment directives, improving code maintainability and reducing duplication.

1794-1794: Appropriate use of centralized validation

Good refactoring to replace inline checks with the new centralized validation function for environment modules.

1800-1800: Appropriate use of centralized validation

Good refactoring to replace inline checks with the new centralized validation function for conda environments.

1808-1808: Appropriate use of centralized validation

Good refactoring to replace inline checks with the new centralized validation function for container/singularity directives.

tests/test_conda_run/expected-results/test.txt (1)

1-1: New test expected result file looks good

This file contains the expected output value (3.0) for the test case that calculates the logarithm base 2 of 8 using numpy.

tests/test_script_xsh/scripts/test.xsh (1)

1-2: Good addition of Xonsh script for testing

The new Xonsh script correctly writes "Hello, world!" to the output file specified in the snakemake.output list. This expands test coverage to include Xonsh script execution capabilities.

tests/test_conda_python_3_7_script/test_script_python_3_7.py (1)

4-4: Minor style change from single to double quotes

The change from single quotes to double quotes is a style change that doesn't affect functionality. Both formats are valid for string literals in Python.

tests/test_script_xsh/envs/xonsh.yaml (1)

1-5: LGTM: Proper conda environment specification for xonsh.

The environment setup for xonsh includes the necessary channels and dependency required for testing xonsh script execution.

tests/test_conda_run/test_python_env.yaml (1)

1-6: LGTM: Appropriate conda environment for Python with NumPy.

The environment definition includes the necessary channels and dependencies for the Python test script that uses NumPy.

tests/test_conda_python_3_7_script/Snakefile (1)

7-7: LGTM: Appropriate script name update.

The updated script name better indicates the Python version specificity of the test script.
tests/test_script_xsh/Snakefile (1)

1-12:
Details

✅ Verification successful

LGTM: Well-formed Snakefile for testing xonsh script integration.

The Snakefile correctly defines a workflow to test the execution of xonsh scripts using conda environments.

One observation: Make sure the scripts/test.xsh file exists at this location, as this Snakefile depends on it but it wasn't included in the files to review.

🏁 Script executed:
#!/bin/bash
# Check if the xonsh script file exists
if [ -f "tests/test_script_xsh/scripts/test.xsh" ]; then
  echo "File exists: tests/test_script_xsh/scripts/test.xsh"
  cat "tests/test_script_xsh/scripts/test.xsh"
else
  echo "File not found: tests/test_script_xsh/scripts/test.xsh"
fi
Length of output: 255
File existence confirmed for xonsh script dependency.

The Snakefile is set up correctly for testing the xonsh integration using a conda environment. We verified that the referenced script exists at the expected relative path (i.e., within tests/test_script_xsh/scripts/test.xsh), and its content is as anticipated. No further changes are required.
snakemake/settings/types.py (1)

205-205: Good addition of the max_checksum_file_size parameter.

The new parameter introduces a useful limit on file sizes for checksum computation, which aligns with the PR objective to allow specifying a maximum file size for checksums during DAG construction. The default value of 1MB seems reasonable.

tests/test_conda_run/Snakefile (1)

1-13: Well-structured test rule with helpful comments.

The rule effectively tests Conda integration for running Python scripts. The comments clearly explain that this approach is for testing purposes and provide guidance about using the script directive in actual workflows.

tests/tests_using_conda.py (2)

309-316: Good test coverage for Xonsh script execution.

This test appropriately handles platform specificity with the @skip_on_windows decorator and follows the established testing pattern in the file.

318-321: Test aligns with PR objectives for Conda run functionality.

This test function appropriately tests the new Conda run feature and follows the established pattern of other test functions in the file.

snakemake/persistence.py (2)

133-133: Appropriate integration of max_checksum_file_size parameter.

The new parameter is correctly initialized from the DAG settings, ensuring consistency across the codebase.

330-330: Proper usage of max_checksum_file_size in checksum calculation.

The parameter is correctly passed to the checksum method for input files, implementing the file size limit functionality.

snakemake/cli.py (2)

957-965: Enhanced report documentation looks good!

The improved help text for the --report option provides clearer guidance about report generation options, explaining the difference between embedded HTML reports for smaller datasets and ZIP archives for larger ones.

1303-1309: New parameter implementation for maximum checksum file size works as expected.

This new parameter implements the core feature mentioned in the PR objectives. Setting a 1MB default for maximum file size for checksum computation is a reasonable choice that balances performance and reliability.

snakemake/dag.py (4)

147-147: Good addition of the max_checksum_file_size parameter.

Initializing this attribute from the workflow dag settings allows for user configuration of maximum file size for checksum verification, which will improve performance when working with large files.

650-650: Updated checksum verification to respect file size limits.

This change properly passes the max_checksum_file_size parameter to the checksum verification while maintaining the force=True parameter, ensuring backward compatibility while adding the new file size limit feature.

1231-1231: Enhanced checksum eligibility check with file size limit.

The code now correctly uses the new max_checksum_file_size parameter to determine if a file is eligible for checksum calculation, which will prevent unnecessary checksum calculations for large files.

1245-1245: Consistent implementation of the file size limit for checksums.

The is_same_checksum call now includes the max_checksum_file_size parameter, maintaining consistency with the other checksum verification calls in the codebase.

snakemake/script/__init__.py (4)

1600-1602: Well-implemented Xonsh script support.

The new XonshScript class properly inherits from PythonScript and appropriately overrides just the execute_script method to use the Xonsh shell instead of Python for execution. This implementation is clean and follows the established pattern used by other script type handlers in the codebase.

1667-1668: Good extension of language detection.

The extension of the get_language function to recognize .xsh files maintains consistency with how other file extensions are detected. This change properly integrates with the existing language detection logic.

1732-1732: Correctly registered XonshScript executor.

The Xonsh script type is properly registered in the exec_class dictionary, allowing the correct executor to be used for Xonsh scripts.

1736-1737: Well-updated error message.

The error message is appropriately updated to include Xonsh as a supported script type, maintaining user-friendly error reporting.

snakemake/io.py (3)

599-605: Proper removal of default threshold parameter.

The default value for the threshold parameter was removed from is_checksum_eligible, which aligns with the PR's objective to make this value configurable through the max_checksum_file_size parameter. This change ensures that threshold values must be explicitly provided, improving configurability.

607-624: Consistent threshold parameter handling.

The default value for the threshold parameter was properly removed from the checksum method as well, maintaining consistency with the changes to is_checksum_eligible. The method documentation is still accurate and clearly explains the purpose of the parameters.

626-632: Completed the threshold parameter standardization.

The default value for the threshold parameter was also removed from is_same_checksum, completing the standardization of threshold parameter handling across all checksum-related methods. This ensures consistent and explicit threshold specification.

docs/snakefiles/reporting.rst (6)

7-10: Improved report introduction with clarity on output formats.

The enhanced introduction provides a clearer explanation of what reports include (runtime statistics, provenance information, workflow topology) and introduces the distinction between single HTML files for smaller reports and ZIP archives for more complex ones. This helps users understand the reporting capabilities better.

13-17: Well-structured section heading for report inclusion.

Adding a dedicated section with proper ReStructuredText anchors for "Including results in a report" improves the document organization and makes it easier to link to this specific section from elsewhere in the documentation.

257-286: Clear instructions for HTML report generation.

This new section provides detailed and clear instructions for generating self-contained HTML reports, including how to specify custom report names. The note about HTML files being suitable only for smaller reports is important guidance for users to set appropriate expectations.

287-303: Comprehensive explanation of ZIP archive reports.

The addition of instructions for generating self-contained ZIP archive reports is valuable, especially the explanation of when to use this format (for complex reports with many outputs) and how to navigate the resulting archive. This will help users choose the appropriate report format for their needs.

304-317: Useful information on partial reports.

The new section on partial reports explains how to generate reports for specific targets, which is particularly useful for large workflows or those with multiple alternative targets. The explanation of use cases helps users understand when this feature might be beneficial.

318-330: Well-documented custom layout capabilities.

The section on custom layouts provides clear instructions for using custom stylesheets, with helpful examples and downloadable resources. This enables users to brand their reports appropriately, which is especially valuable for institutional use.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

test-environment.yml (2)

7-7: Clarify the setuptools Dependency Version

The new inclusion of setuptools in the top-level dependencies is appropriate for ensuring that build and installation processes have the necessary tooling. Consider specifying a version constraint if your project relies on features or compatibility fixes from a specific version.

61-62: Elevate cwltool and cwl-utils to Top-Level Dependencies

Moving cwltool and cwl-utils from the nested pip section to the top-level dependencies enhances consistency and can simplify dependency resolution. Please verify that these packages are available in the Conda channels (e.g., conda-forge or bioconda) as expected. This change aligns well with the overall objective of standardizing dependency management.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 117ed5a and e4f92ce.

⛔ Files ignored due to path filters (1)

pyproject.toml is excluded by !pyproject.toml

📒 Files selected for processing (1)

test-environment.yml (2 hunks)

johanneskoester

Thanks a lot! One minor thing below.

snakemake/cli.py

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

snakemake/workflow.py (1)
1812-1813: Avoid inequality comparison to False

According to Python style conventions, it's better to use truthiness checks rather than explicit comparisons to False.
-                if not ruleinfo.template_engine and ruleinfo.container_img != False:
+                if not ruleinfo.template_engine and ruleinfo.container_img is not False:
Using is not False is more appropriate here since you're specifically checking for the False constant rather than falsy values in general.

🧰 Tools

🪛 Ruff (0.8.2)

1813-1813: Avoid inequality comparisons to False; use if ruleinfo.container_img: for truth checks

Replace with ruleinfo.container_img

(E712)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9cb22bb and 90150c8.

📒 Files selected for processing (8)

snakemake/cli.py (2 hunks)
snakemake/cli.py (1 hunks)
snakemake/cli.py (1 hunks)
snakemake/workflow.py (1 hunks)
snakemake/cli.py (2 hunks)
snakemake/cli.py (1 hunks)
snakemake/cli.py (1 hunks)
snakemake/workflow.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (5)

snakemake/cli.py
snakemake/cli.py
snakemake/cli.py
snakemake/cli.py
snakemake/cli.py

🧰 Additional context used

📓 Path-based instructions (1)

`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

snakemake/workflow.py
snakemake/cli.py

🪛 Ruff (0.8.2)

snakemake/workflow.py

1813-1813: Avoid inequality comparisons to False; use if ruleinfo.container_img: for truth checks

Replace with ruleinfo.container_img

(E712)

🔇 Additional comments (6)

snakemake/cli.py (2)

73-76: Excellent implementation of the human-friendly size parser

The function correctly imports and uses the parse_size function from the humanfriendly library, allowing users to specify sizes in a more readable format (e.g., "1GB", "500MB").

1308-1314: Good enhancement to the argument parser for file size specification

Replacing type=int with parse_func=parse_size_in_bytes improves user experience by allowing human-readable size formats (like "100MB" or "1GB") instead of requiring raw byte counts.

Note that this implements johanneskoester's suggestion from the previous review to use the humanfriendly package for parsing size values.

snakemake/workflow.py (4)

1779-1787: Well-designed refactoring of validation logic

The new check_may_use_software_deployment function nicely centralizes the logic for validating whether a rule can use specific software deployment methods. This improves maintainability by reducing code duplication and ensuring consistent error messages.

1794-1794: Clean code refactoring using the new helper function

Good usage of the newly created validation function for the envmodules directive.

1800-1800: Clean code refactoring using the new helper function

Good usage of the newly created validation function for the conda directive.

1808-1808: Clean code refactoring using the new helper function

Good usage of the newly created validation function for the container/singularity directive.

coderabbitai

Caution

Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 1

🛑 Comments failed to post (1)

snakemake/ioutils/input.py (1)
12-30: ⚠️ Potential issue

Fix import error handling and lambda usage in the extract_checksum function.

There are several issues that need to be addressed:

WorkflowError is used but not imported

Lambda function usage where a regular function is preferred

Exception handling doesn't preserve the original exception context

Apply these changes:
+from snakemake.exceptions import WorkflowError

 def extract_checksum(infile, **kwargs):
     try:
         import pandas as pd
 
-        fix_file_name = lambda x: x.removeprefix("./")
+        def fix_file_name(x):
+            return x.removeprefix("./")
+            
         return (
             pd.read_csv(
                 infile,
                 sep="  ",
                 header=None,
                 engine="python",
                 converters={1: fix_file_name},
             )
             .set_index(1)
             .loc[fix_file_name(kwargs.get("file"))]
             .item()
         )
-    except ImportError:
-        raise WorkflowError("Pandas is required to extract checksum from file.")
+    except ImportError as e:
+        raise WorkflowError("Pandas is required to extract checksum from file.") from e
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
from snakemake.exceptions import WorkflowError

def extract_checksum(infile, **kwargs):
    try:
        import pandas as pd

        def fix_file_name(x):
            return x.removeprefix("./")
            
        return (
            pd.read_csv(
                infile,
                sep="  ",
                header=None,
                engine="python",
                converters={1: fix_file_name},
            )
            .set_index(1)
            .loc[fix_file_name(kwargs.get("file"))]
            .item()
        )
    except ImportError as e:
        raise WorkflowError("Pandas is required to extract checksum from file.") from e
🧰 Tools

🪛 Ruff (0.8.2)

16-16: Do not assign a lambda expression, use a def

Rewrite fix_file_name as a def

(E731)

30-30: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

30-30: Undefined name WorkflowError

(F821)

coderabbitai

Caution

Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 1

🧹 Nitpick comments (11)

snakemake/utils.py (4)
130-134: Consider chaining the raised exception.

Use exception chaining to improve tracebacks, for example:
-    raise WorkflowError(
-        f"Error validating row {i} of data frame.", e
-    )
+    raise WorkflowError(
+        f"Error validating row {i} of data frame."
+    ) from e
🧰 Tools

🪛 Ruff (0.8.2)

132-134: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

166-170: Consider chaining the raised exception.

Same rationale as above; use from e to trace the original cause.

🧰 Tools

🪛 Ruff (0.8.2)

168-170: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

205-208: Consider exception chaining for better tracebacks.

Apply the same solution with from e in this block.

🧰 Tools

🪛 Ruff (0.8.2)

206-208: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

222-224: Chaining the exception clarifies error origins.

Use raise WorkflowError("...", e) from e or similar to chain the original exception.

🧰 Tools

🪛 Ruff (0.8.2)

224-224: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)
.github/workflows/main.yml (2)

51-54: Address empty exclude and trailing spaces.

Consider removing the empty exclude: [] block if not needed, and eliminate trailing whitespace on line 53.

🧰 Tools

🪛 actionlint (1.7.4)

52-52: "exclude" section should not be empty

(syntax-check)

🪛 YAMLlint (1.35.1)

[error] 53-53: trailing spaces

(trailing-spaces)

210-218: Add a newline at the end of file.

The YAML linter flags a missing newline at line 218. Also, the new testing-done job logic looks solid.

🧰 Tools

🪛 YAMLlint (1.35.1)

[error] 218-218: no new line character at the end of file

(new-line-at-end-of-file)

tests/test_validate/samples.schema.yaml (1)

9-17: Consider stricter validation rules.
Currently, n defaults to 0, and tissue defaults to "blood." If these values have meaningful constraints (e.g., n >= 1 or a limited set of valid tissues), adding constraints (e.g., minimum, enum) could improve data integrity. Otherwise, 0 or "blood" might silently hide errors.

snakemake/assets/__init__.py (2)

96-351: Consider centralizing asset definitions.
A large block of asset references has been added for Vega and its dependencies. Storing them in a separate data structure or file could reduce duplication (e.g., repeated SHA256 values) and simplify updates.

510-530: Check for potential future updates in prop-types dependencies.
Though many tree-shaken dependencies are not included in prop-types.min.js, maintaining these references in code may complicate updates. Consider a system to dynamically fetch license references or store them in a standalone configuration file for quicker updates.

tests/test_validate/Snakefile (2)

11-27: Reduce duplication by validating more rows programmatically.
Lines 11–21 and 22–27 both grab the first two rows in different DF contexts. If your schema applies to all rows, consider validating every row or the entire DataFrame once, rather than duplicating similar logic for each row index.

41-50: Confirm LazyFrame behavior with schema validation.
You are validating the LazyFrame with set_default=False prior to calling .collect(). Ensure that any default values or schema-based transformations occur as intended. If not, calling validate() after materializing the data might be more accurate.

🛑 Comments failed to post (1)

tests/test_validate/Snakefile (1)

55-57: 🛠️ Refactor suggestion

Consolidate index-based validations.
Lines 55–57 repeat checks done earlier on the first two rows. If the entire DataFrame must conform, consider applying a single validation pass for all rows or unify index-based checks into a helper function to reduce repetition.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

snakemake/cli.py (1)
1317-1324: Good implementation of the new configuration parameter

The addition of the --max-checksum-file-size parameter is well-implemented with a clear help message, appropriate default value, and proper integration with the argument parser.

One suggestion would be to make the help text more specific by including the default value and giving an example of how to specify sizes in human-readable format:
    help="Compute the checksum during DAG computation and job postprocessing "
-        "only for files that are smaller than the provided threshold.",
+        "only for files that are smaller than the provided threshold. Default is 1MB. "
+        "Values can be specified in human-readable format, e.g., '10MB', '1GB'.",

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fee1986 and 8cce025.

📒 Files selected for processing (4)

snakemake/cli.py (3 hunks)
snakemake/dag.py (4 hunks)
snakemake/persistence.py (2 hunks)
snakemake/settings/types.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

snakemake/settings/types.py
snakemake/persistence.py
snakemake/dag.py

🧰 Additional context used

📓 Path-based instructions (1)

`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

snakemake/cli.py

⏰ Context from checks skipped due to timeout of 90000ms (30)

GitHub Check: testing (10, 3.12, bash)
GitHub Check: testing (10, 3.11, bash)
GitHub Check: testing (9, 3.12, bash)
GitHub Check: testing (9, 3.11, bash)
GitHub Check: testing (8, 3.12, bash)
GitHub Check: testing (8, 3.11, bash)
GitHub Check: testing (7, 3.12, bash)
GitHub Check: testing (7, 3.11, bash)
GitHub Check: testing (6, 3.12, bash)
GitHub Check: testing (6, 3.11, bash)
GitHub Check: testing (5, 3.12, bash)
GitHub Check: testing-windows (10)
GitHub Check: testing (5, 3.11, bash)
GitHub Check: testing-windows (9)
GitHub Check: testing (4, 3.12, bash)
GitHub Check: testing-windows (8)
GitHub Check: testing (4, 3.11, bash)
GitHub Check: testing-windows (7)
GitHub Check: testing (3, 3.12, bash)
GitHub Check: testing-windows (6)
GitHub Check: testing (3, 3.11, bash)
GitHub Check: testing-windows (5)
GitHub Check: testing (2, 3.12, bash)
GitHub Check: testing-windows (4)
GitHub Check: testing (2, 3.11, bash)
GitHub Check: testing-windows (3)
GitHub Check: testing (1, 3.12, bash)
GitHub Check: testing-windows (2)
GitHub Check: testing (1, 3.11, bash)
GitHub Check: testing-windows (1)

🔇 Additional comments (2)

snakemake/cli.py (2)

73-76: Consider simplifying by using humanfriendly.parse_size directly

This function is a simple wrapper around humanfriendly.parse_size. As suggested in a previous comment by johanneskoester, you could pass humanfriendly.parse_size directly to the argument parser to eliminate this wrapper function.
- def parse_size_in_bytes(value):
-     from humanfriendly import parse_size
- 
-     return parse_size(value)
Then in the argument parser definition (lines 1317-1324), you would modify it like this:
    group_behavior.add_argument(
        "--max-checksum-file-size",
        default=1000000,
        metavar="BYTES",
-       parse_func=parse_size_in_bytes,
+       parse_func=humanfriendly.parse_size,
        help="Compute the checksum during DAG computation and job postprocessing "
        "only for files that are smaller than the provided threshold.",
    )
Don't forget to add from humanfriendly import parse_size near the top of the file with the other imports.

2121-2121: Correctly integrated the new parameter into DAG settings

The max_checksum_file_size parameter is properly passed to the DAGSettings class, ensuring it will be available during DAG computation.

snakemake/cli.py

sonarqubecloud · 2025-03-12T16:14:36Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

🤖 I have created a release *beep* *boop* --- ## [9.0.0](v8.30.0...v9.0.0) (2025-03-14) ### ⚠ BREAKING CHANGES * Logging refactor & add LoggerPluginInterface ([#3107](#3107)) ### Features * [#3412](#3412) - keep shadow folder of failed job if --keep-incomplete flag is set. ([#3430](#3430)) ([22978c3](22978c3)) * add flag --report-after-run to automatically generate the report after a successfull workflow run ([#3428](#3428)) ([b0a7f03](b0a7f03)) * add flatten function to IO utils ([#3424](#3424)) ([67fa392](67fa392)) * add helper functions to parse input files ([#2918](#2918)) ([63e45a7](63e45a7)) * Add option to print redacted file names ([#3089](#3089)) ([ba4d264](ba4d264)) * add support for validation of polars dataframe and lazyframe ([#3262](#3262)) ([c7473a6](c7473a6)) * added support for rendering dag with mermaid js ([#3409](#3409)) ([7bf8381](7bf8381)) * adding --replace-workflow-config to fully replace workflow configs (from config: directive) with --configfile, instead of merging them ([#3381](#3381)) ([47504a0](47504a0)) * Dynamic module name ([#3401](#3401)) ([024dc32](024dc32)) * Enable saving and reloading IOCache object ([#3386](#3386)) ([c935953](c935953)) * files added in rule params with workflow.source_path will be available in used containers ([#3385](#3385)) ([a6e45bf](a6e45bf)) * Fix keep_local in storage directive and more freedom over remote retrieval behaviour ([#3410](#3410)) ([67b4739](67b4739)) * inherit parameters of use rule and extend/replace individual items them when using 'with' directive ([#3365](#3365)) ([93e4b92](93e4b92)) * Logging refactor & add LoggerPluginInterface ([#3107](#3107)) ([86f1d6e](86f1d6e)) * Maximal file size for checksums ([#3368](#3368)) ([b039f8a](b039f8a)) * Modernize package configuration using Pixi ([#3369](#3369)) ([77992d8](77992d8)) * multiext support for named input/output ([#3372](#3372)) ([05e1378](05e1378)) * optionally auto-group jobs via temp files in case of remote execution ([#3378](#3378)) ([cc9bba2](cc9bba2)) ### Bug Fixes * `--delete-all-output` ignores `--dry-run` ([#3265](#3265)) ([23fef82](23fef82)) * 3342 faster touch runs and warning messages for non-existing files ([#3398](#3398)) ([cd9c3c3](cd9c3c3)) * add default value to max-jobs-per-timespan ([#3043](#3043)) ([2959abe](2959abe)) * checkpoints inside modules are overwritten ([#3359](#3359)) ([fba3ac7](fba3ac7)) * Convert Path to IOFile ([#3405](#3405)) ([c58684c](c58684c)) * Do not perform storage object cleanup with --keep-storage-local-copies set ([#3358](#3358)) ([9a6d14b](9a6d14b)) * edgecases of source deployment in case of remote execution ([#3396](#3396)) ([5da13be](5da13be)) * enhance error message formatting for strict DAG-building mode ([#3376](#3376)) ([a1c39ee](a1c39ee)) * fix bug in checkpoint handling that led to exceptions in case checkpoint output was missing upon rerun ([#3423](#3423)) ([8cf4a2f](8cf4a2f)) * force check all required outputs ([#3341](#3341)) ([495a4e7](495a4e7)) * group job formatting ([#3442](#3442)) ([f0b10a3](f0b10a3)) * in remote jobs, upload storage in topological order such that modification dates are preserved (e.g. in case of group jobs) ([#3377](#3377)) ([eace08f](eace08f)) * only skip eval when resource depends on input ([#3374](#3374)) ([4574c92](4574c92)) * Prevent execution of conda in apptainer when not explicitly requested in software deployment method ([#3388](#3388)) ([c43c5c0](c43c5c0)) * print filenames with quotes around them in RuleException ([#3269](#3269)) ([6baeda5](6baeda5)) * Re-evaluation of free resources ([#3399](#3399)) ([6371293](6371293)) * ReadTheDocs layout issue due to src directory change ([#3419](#3419)) ([695b127](695b127)) * robustly escaping quotes in generated bash scripts (v2) ([#3297](#3297)) ([#3389](#3389)) ([58720bd](58720bd)) * Show apptainer image URL in snakemake report ([#3407](#3407)) ([45f0450](45f0450)) * Update ReadTheDocs configuration for documentation build to use Pixi ([#3433](#3433)) ([3f227a6](3f227a6)) ### Documentation * Add pixi setup instructions to general use tutorial ([#3382](#3382)) ([115e81b](115e81b)) * fix contribution section heading levels, fix docs testing setup order ([#3360](#3360)) ([051dc53](051dc53)) * fix link to github.com/snakemake/poetry-snakemake-plugin ([#3436](#3436)) ([ec6d97c](ec6d97c)) * fix quoting ([#3394](#3394)) ([b40f599](b40f599)) * fix rerun-triggers default ([#3403](#3403)) ([4430e23](4430e23)) * fix typo 'safe' -> 'save' ([#3384](#3384)) ([7755861](7755861)) * mention code formatting in the contribution section ([#3431](#3431)) ([e8682b7](e8682b7)) * remove duplicated 'functions'. ([#3356](#3356)) ([7c595db](7c595db)) * update broken links documentation ([#3437](#3437)) ([e3d0d88](e3d0d88)) * Updating contributing guidelines with new pixi dev setup ([#3415](#3415)) ([8e95a12](8e95a12)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: snakemake-bot <snakemake-bot-admin@googlegroups.com>

Add a new parameter `--max-checksum-file-size` to adjust the maximal file size for which a checksum is computed when the DAG is build. This is related to snakemake#2949 ### QC  * [ ] The PR contains a test case for the changes or the changes are already covered by an existing test case. * [ ] The documentation (`docs/`) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).  ## Summary by CodeRabbit - **New Features** - Added a new command-line option to let users specify a maximum file size for checksum processing, with a default threshold of 1,000,000 bytes. - **Refactor** - Updated file validation and checksum routines to enforce the new file size limit, ensuring more efficient processing across workflow operations. - Made method signatures more explicit by removing default values for threshold parameters in checksum-related methods.  --------- Co-authored-by: Johanna <johanna_schmitz@posteo.de> Co-authored-by: johannaschmitz <56937653+johannaschmitz@users.noreply.github.com> Co-authored-by: Johannes Köster <johannes.koester@uni-due.de>

🤖 I have created a release *beep* *boop* --- ## [9.0.0](snakemake/snakemake@v8.30.0...v9.0.0) (2025-03-14) ### ⚠ BREAKING CHANGES * Logging refactor & add LoggerPluginInterface ([snakemake#3107](snakemake#3107)) ### Features * [snakemake#3412](snakemake#3412) - keep shadow folder of failed job if --keep-incomplete flag is set. ([snakemake#3430](snakemake#3430)) ([22978c3](snakemake@22978c3)) * add flag --report-after-run to automatically generate the report after a successfull workflow run ([snakemake#3428](snakemake#3428)) ([b0a7f03](snakemake@b0a7f03)) * add flatten function to IO utils ([snakemake#3424](snakemake#3424)) ([67fa392](snakemake@67fa392)) * add helper functions to parse input files ([snakemake#2918](snakemake#2918)) ([63e45a7](snakemake@63e45a7)) * Add option to print redacted file names ([snakemake#3089](snakemake#3089)) ([ba4d264](snakemake@ba4d264)) * add support for validation of polars dataframe and lazyframe ([snakemake#3262](snakemake#3262)) ([c7473a6](snakemake@c7473a6)) * added support for rendering dag with mermaid js ([snakemake#3409](snakemake#3409)) ([7bf8381](snakemake@7bf8381)) * adding --replace-workflow-config to fully replace workflow configs (from config: directive) with --configfile, instead of merging them ([snakemake#3381](snakemake#3381)) ([47504a0](snakemake@47504a0)) * Dynamic module name ([snakemake#3401](snakemake#3401)) ([024dc32](snakemake@024dc32)) * Enable saving and reloading IOCache object ([snakemake#3386](snakemake#3386)) ([c935953](snakemake@c935953)) * files added in rule params with workflow.source_path will be available in used containers ([snakemake#3385](snakemake#3385)) ([a6e45bf](snakemake@a6e45bf)) * Fix keep_local in storage directive and more freedom over remote retrieval behaviour ([snakemake#3410](snakemake#3410)) ([67b4739](snakemake@67b4739)) * inherit parameters of use rule and extend/replace individual items them when using 'with' directive ([snakemake#3365](snakemake#3365)) ([93e4b92](snakemake@93e4b92)) * Logging refactor & add LoggerPluginInterface ([snakemake#3107](snakemake#3107)) ([86f1d6e](snakemake@86f1d6e)) * Maximal file size for checksums ([snakemake#3368](snakemake#3368)) ([b039f8a](snakemake@b039f8a)) * Modernize package configuration using Pixi ([snakemake#3369](snakemake#3369)) ([77992d8](snakemake@77992d8)) * multiext support for named input/output ([snakemake#3372](snakemake#3372)) ([05e1378](snakemake@05e1378)) * optionally auto-group jobs via temp files in case of remote execution ([snakemake#3378](snakemake#3378)) ([cc9bba2](snakemake@cc9bba2)) ### Bug Fixes * `--delete-all-output` ignores `--dry-run` ([snakemake#3265](snakemake#3265)) ([23fef82](snakemake@23fef82)) * 3342 faster touch runs and warning messages for non-existing files ([snakemake#3398](snakemake#3398)) ([cd9c3c3](snakemake@cd9c3c3)) * add default value to max-jobs-per-timespan ([snakemake#3043](snakemake#3043)) ([2959abe](snakemake@2959abe)) * checkpoints inside modules are overwritten ([snakemake#3359](snakemake#3359)) ([fba3ac7](snakemake@fba3ac7)) * Convert Path to IOFile ([snakemake#3405](snakemake#3405)) ([c58684c](snakemake@c58684c)) * Do not perform storage object cleanup with --keep-storage-local-copies set ([snakemake#3358](snakemake#3358)) ([9a6d14b](snakemake@9a6d14b)) * edgecases of source deployment in case of remote execution ([snakemake#3396](snakemake#3396)) ([5da13be](snakemake@5da13be)) * enhance error message formatting for strict DAG-building mode ([snakemake#3376](snakemake#3376)) ([a1c39ee](snakemake@a1c39ee)) * fix bug in checkpoint handling that led to exceptions in case checkpoint output was missing upon rerun ([snakemake#3423](snakemake#3423)) ([8cf4a2f](snakemake@8cf4a2f)) * force check all required outputs ([snakemake#3341](snakemake#3341)) ([495a4e7](snakemake@495a4e7)) * group job formatting ([snakemake#3442](snakemake#3442)) ([f0b10a3](snakemake@f0b10a3)) * in remote jobs, upload storage in topological order such that modification dates are preserved (e.g. in case of group jobs) ([snakemake#3377](snakemake#3377)) ([eace08f](snakemake@eace08f)) * only skip eval when resource depends on input ([snakemake#3374](snakemake#3374)) ([4574c92](snakemake@4574c92)) * Prevent execution of conda in apptainer when not explicitly requested in software deployment method ([snakemake#3388](snakemake#3388)) ([c43c5c0](snakemake@c43c5c0)) * print filenames with quotes around them in RuleException ([snakemake#3269](snakemake#3269)) ([6baeda5](snakemake@6baeda5)) * Re-evaluation of free resources ([snakemake#3399](snakemake#3399)) ([6371293](snakemake@6371293)) * ReadTheDocs layout issue due to src directory change ([snakemake#3419](snakemake#3419)) ([695b127](snakemake@695b127)) * robustly escaping quotes in generated bash scripts (v2) ([snakemake#3297](snakemake#3297)) ([snakemake#3389](snakemake#3389)) ([58720bd](snakemake@58720bd)) * Show apptainer image URL in snakemake report ([snakemake#3407](snakemake#3407)) ([45f0450](snakemake@45f0450)) * Update ReadTheDocs configuration for documentation build to use Pixi ([snakemake#3433](snakemake#3433)) ([3f227a6](snakemake@3f227a6)) ### Documentation * Add pixi setup instructions to general use tutorial ([snakemake#3382](snakemake#3382)) ([115e81b](snakemake@115e81b)) * fix contribution section heading levels, fix docs testing setup order ([snakemake#3360](snakemake#3360)) ([051dc53](snakemake@051dc53)) * fix link to github.com/snakemake/poetry-snakemake-plugin ([snakemake#3436](snakemake#3436)) ([ec6d97c](snakemake@ec6d97c)) * fix quoting ([snakemake#3394](snakemake#3394)) ([b40f599](snakemake@b40f599)) * fix rerun-triggers default ([snakemake#3403](snakemake#3403)) ([4430e23](snakemake@4430e23)) * fix typo 'safe' -> 'save' ([snakemake#3384](snakemake#3384)) ([7755861](snakemake@7755861)) * mention code formatting in the contribution section ([snakemake#3431](snakemake#3431)) ([e8682b7](snakemake@e8682b7)) * remove duplicated 'functions'. ([snakemake#3356](snakemake#3356)) ([7c595db](snakemake@7c595db)) * update broken links documentation ([snakemake#3437](snakemake#3437)) ([e3d0d88](snakemake@e3d0d88)) * Updating contributing guidelines with new pixi dev setup ([snakemake#3415](snakemake#3415)) ([8e95a12](snakemake@8e95a12)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: snakemake-bot <snakemake-bot-admin@googlegroups.com>

JZentgraf and others added 4 commits March 10, 2025 17:34

start introducing file size threshold for checksum

88f1168

fix checksum threshold in persistence

3025ed2

argparse help message for max-checksum-file-size

853a363

Merge branch 'snakemake:main' into checksums

9476d7b

JZentgraf requested a review from johanneskoester as a code owner March 11, 2025 08:33

coderabbitai bot reviewed Mar 11, 2025

View reviewed changes

JZentgraf and others added 4 commits March 11, 2025 09:42

fix format in dag.py

277ceae

formatting persistence.py

79f4f3a

fix wrong order of arguments

117ed5a

Merge branch 'main' into checksums

e4f92ce

coderabbitai bot reviewed Mar 11, 2025

View reviewed changes

johanneskoester requested changes Mar 11, 2025

View reviewed changes

snakemake/cli.py Show resolved Hide resolved

JZentgraf and others added 4 commits March 11, 2025 13:50

add support for human friendly (1M, 1G, ...) input

0e3deca

Merge branch 'main' into checksums

b99c4d8

new line formatting

9cb22bb

Merge branch 'main' into checksums

90150c8

coderabbitai bot reviewed Mar 11, 2025

View reviewed changes

Merge branch 'main' into checksums

acb2845

coderabbitai bot reviewed Mar 11, 2025

View reviewed changes

JZentgraf requested a review from johanneskoester March 11, 2025 13:43

Merge branch 'main' into checksums

fee1986

coderabbitai bot reviewed Mar 11, 2025

View reviewed changes

Merge branch 'main' into checksums

8cce025

coderabbitai bot reviewed Mar 12, 2025

View reviewed changes

johanneskoester reviewed Mar 12, 2025

View reviewed changes

snakemake/cli.py Outdated Show resolved Hide resolved

johanneskoester reviewed Mar 12, 2025

View reviewed changes

snakemake/cli.py Outdated Show resolved Hide resolved

johanneskoester added 2 commits March 12, 2025 11:45

Update snakemake/cli.py

b6e59af

Update snakemake/cli.py

1526067

johanneskoester approved these changes Mar 12, 2025

View reviewed changes

johanneskoester enabled auto-merge (squash) March 12, 2025 10:50

Merge branch 'main' into checksums

1696232

johanneskoester merged commit b039f8a into snakemake:main Mar 12, 2025
41 checks passed

github-actions bot mentioned this pull request Mar 12, 2025

chore(main): release 9.0.0 #3380

Merged

Conversation

JZentgraf commented Mar 11, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

QC

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Mar 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Possibly related PRs

Suggested reviewers

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

johanneskoester left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

sonarqubecloud bot commented Mar 12, 2025

Quality Gate passed

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

JZentgraf commented Mar 11, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 11, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)