Skip to content

fix: --delete-all-output ignores --dry-run#3265

Merged
johanneskoester merged 9 commits intosnakemake:mainfrom
GaspardR:delete_all_output_dryrun
Mar 13, 2025
Merged

fix: --delete-all-output ignores --dry-run#3265
johanneskoester merged 9 commits intosnakemake:mainfrom
GaspardR:delete_all_output_dryrun

Conversation

@GaspardR
Copy link
Copy Markdown
Contributor

@GaspardR GaspardR commented Jan 22, 2025

Fixes #3264

QC

  • The PR contains a test case for the changes or the changes are already covered by an existing test case.
  • The documentation (docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

Should I add a test case for this?

Summary by CodeRabbit

  • New Features

    • Added a dry run option for output deletion, allowing users to simulate deletions before execution.
    • Enhanced workflow commands to improve file and directory management during output operations.
  • Bug Fixes

    • Corrected a file path reference in deletion validation to ensure more reliable output handling.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Jan 22, 2025

📝 Walkthrough

Walkthrough

This pull request introduces a new test function in the test suite that verifies the dry run behavior for deleting all outputs. Two new Snakefile workflows are added to manage outputs and temporary files through a set of defined rules, including invoking an inner workflow for various file operations. Additionally, a shell command in an existing Snakefile is updated to correct a path reference. In the CLI, the deletion function is modified to accept a dry run flag, allowing for simulation of deletion actions.

Changes

File(s) Change Summary
tests/tests.py Added new test function test_github_issue_3265_respect_dryrun_delete_all() decorated with @skip_on_windows to validate dry run deletion functionality.
tests/test_github_issue_3265_respect_dryrun_delete_all/Snakefile
tests/test_github_issue_3265_respect_dryrun_delete_all/Snakefile_inner
Introduced new Snakemake workflows with several rules for managing outputs, temporary files, and directory manipulations. These files define rules for aggregating outputs, deleting all outputs, and handling temporary outputs through inner workflow calls.
tests/test_delete_all_output/Snakefile Modified shell command in the delete_all_output rule to update the file path reference from somedir/final to some_dir/final.
src/snakemake/cli.py Updated the args_to_api function to pass the dryrun flag to the delete_output method, enabling conditional (dry run) execution of the deletion process.

Sequence Diagram(s)

sequenceDiagram
    participant CLI
    participant DAG_API
    participant FileSystem

    CLI->>DAG_API: delete_output(dryrun)
    alt Dry run enabled
        DAG_API->>FileSystem: Simulate deletion of outputs
        FileSystem-->>DAG_API: Return simulation result
    else Actual deletion
        DAG_API->>FileSystem: Delete outputs
        FileSystem-->>DAG_API: Confirm deletion
    end
    DAG_API-->>CLI: Return operation result
Loading
sequenceDiagram
    participant Rule as delete_all_output Rule
    participant Shell
    participant InnerWorkflow as Snakefile_inner

    Rule->>Shell: Execute: echo PATH, list directory, run snakemake -s Snakefile_inner
    Shell-->>Rule: Return output from inner workflow
    Rule->>Shell: Execute dry run deletion command
    Shell-->>Rule: Return dry run simulation result
Loading

Suggested reviewers

  • johanneskoester

Tip

⚡🧪 Multi-step agentic review comment chat (experimental)
  • We're introducing multi-step agentic chat in review comments. This experimental feature enhances review discussions with the CodeRabbit agentic chat by enabling advanced interactions, including the ability to create pull requests directly from comments.
    - To enable this feature, set early_access to true under in the settings.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e628a95 and 8e51364.

📒 Files selected for processing (5)
  • src/snakemake/cli.py (1 hunks)
  • tests/test_delete_all_output/Snakefile (1 hunks)
  • tests/test_github_issue_3265_respect_dryrun_delete_all/Snakefile (1 hunks)
  • tests/test_github_issue_3265_respect_dryrun_delete_all/Snakefile_inner (1 hunks)
  • tests/tests.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • tests/test_delete_all_output/Snakefile
  • tests/tests.py
  • tests/test_github_issue_3265_respect_dryrun_delete_all/Snakefile
  • tests/test_github_issue_3265_respect_dryrun_delete_all/Snakefile_inner
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • src/snakemake/cli.py
🔇 Additional comments (1)
src/snakemake/cli.py (1)

2096-2096: Fix for --delete-all-output to respect --dry-run flag

This change correctly adds the dryrun parameter to the delete_output method call, ensuring that the --delete-all-output option respects the --dry-run flag. The temporary output deletion already had this parameter (line 2098), so this change ensures consistent behavior across both output deletion options.

This addresses GitHub issue #3264 as mentioned in the PR description.

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@sonarqubecloud
Copy link
Copy Markdown

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
snakemake/cli.py (1)

2152-2152: LGTM! Important safety improvement.

The change correctly ensures that --delete-all-output respects the --dry-run flag by passing the dryrun parameter to delete_output(). This is a critical safety improvement that prevents accidental file deletion when dry run mode is enabled.

This change follows the principle of least surprise by making the --dry-run flag behavior consistent across all operations, including destructive ones like --delete-all-output.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7a75043 and 0732558.

📒 Files selected for processing (1)
  • snakemake/cli.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/cli.py (1)

Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

@bayraktar1 bayraktar1 self-assigned this Mar 11, 2025
@bayraktar1 bayraktar1 added the bug Something isn't working label Mar 11, 2025
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 3

🔭 Outside diff range comments (1)
snakemake/ioutils/input.py (1)

12-31: ⚠️ Potential issue

Fix several issues in the extract_checksum function.

The function has a few issues that need to be addressed:

  1. The WorkflowError is undefined and needs to be imported.
  2. The lambda function should be defined as a regular function per best practices.
  3. The exception handling doesn't preserve the traceback.

Apply these changes:

+from snakemake.exceptions import WorkflowError

def parse_input(infile, parser, **kwargs):
    def inner(wildcards, input, output):
        with open(infile, "r") as fh:
            if parser is None:
                return fh.read().strip()
            else:
                return parser(fh, **kwargs)

    return inner


def extract_checksum(infile, **kwargs):
    try:
        import pandas as pd

-        fix_file_name = lambda x: x.removeprefix("./")
+        def fix_file_name(x):
+            return x.removeprefix("./")
        return (
            pd.read_csv(
                infile,
                sep="  ",
                header=None,
                engine="python",
                converters={1: fix_file_name},
            )
            .set_index(1)
            .loc[fix_file_name(kwargs.get("file"))]
            .item()
        )
    except ImportError as err:
-        raise WorkflowError("Pandas is required to extract checksum from file.")
+        raise WorkflowError("Pandas is required to extract checksum from file.") from err
🧰 Tools
🪛 Ruff (0.8.2)

16-16: Do not assign a lambda expression, use a def

Rewrite fix_file_name as a def

(E731)


30-30: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


30-30: Undefined name WorkflowError

(F821)

🧹 Nitpick comments (36)
snakemake/deployment/singularity.py (1)

130-130: Removal of --cleanenv flag may affect container environment isolation

The --cleanenv flag has been removed from the singularity command. This flag previously prevented the host environment from being passed to the container (except for explicitly set variables).

Consider documenting this change if it's intentional, as it changes the container isolation behavior. If this change is related to the PR objective of fixing --delete-all-output respecting --dry-run, please add a comment explaining the connection.

snakemake/utils.py (3)

111-117: Ensure clarity on external variables in nested function
The nested function _validate_record(record) relies on set_default, schema, and resolver from the outer scope. While this works due to Python scoping rules, it may be clearer to pass these as parameters. If code maintainability is a concern, refactoring the function signature to accept them directly can improve readability.


150-218: Partially validating polars LazyFrame & raising from original exceptions

  1. For LazyFrame (lines 189–212), only the first 1000 rows are checked. Consider a configurable chunk to better control validation time for large datasets.
  2. As with _validate_pandas, use raise ... from e at lines 205–208 for improved traceback linking:
-except jsonschema.exceptions.ValidationError as e:
-    raise WorkflowError(f"Error validating row {i} of data frame.", e)
+except jsonschema.exceptions.ValidationError as e:
+    raise WorkflowError(f"Error validating row {i} of data frame.") from e
🧰 Tools
🪛 Ruff (0.8.2)

168-170: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


206-208: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


219-230: Streamline exceptions when validating dictionary
At line 224, apply the same practice of raising from the original exception:

-raise WorkflowError("Error validating config file.", e)
+raise WorkflowError("Error validating config file.") from e
🧰 Tools
🪛 Ruff (0.8.2)

224-224: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

docs/snakefiles/deployment.rst (1)

429-429: Minor rewording suggestion
The introductory sentence for container usage could be expanded for clarity, e.g. “As an alternative to using Conda for package management, you can isolate and run entire jobs in containers.”

snakemake/script/__init__.py (1)

56-103: Consider simplifying anchor logic
In the ReportHref class, lines 97–100 could be shortened using the Python inline if-expression, which can be more concise:

-if self._anchor:
-    anchor = f"#{urllib.parse.quote(self._anchor)}"
-else:
-    anchor = ""
+anchor = f"#{urllib.parse.quote(self._anchor)}" if self._anchor else ""

Otherwise, the overall approach for building hierarchical links is well-structured.

🧰 Tools
🪛 Ruff (0.8.2)

97-100: Use ternary operator anchor = f"#{urllib.parse.quote(self._anchor)}" if self._anchor else "" instead of if-else-block

Replace if-else-block with anchor = f"#{urllib.parse.quote(self._anchor)}" if self._anchor else ""

(SIM108)

tests/tests.py (1)

18-18: Remove unused import.

Since patch("snakemake.utils.min_version", ...) references the attribute by string, this local import isn’t used. Please remove it to clean up.

- from snakemake.utils import min_version  # import so we can patch out if needed
🧰 Tools
🪛 Ruff (0.8.2)

18-18: snakemake.utils.min_version imported but unused

Remove unused import: snakemake.utils.min_version

(F401)

misc/vim/syntax/snakemake.vim (1)

4-4: Check the "Last Change" date

The last change date is set to 2025 Mar 6, which is in the future. Consider updating it to the current date.

-" Last Change: 2025 Mar 6
+" Last Change: 2023 Mar 6
snakemake/report/html_reporter/template/components/common.js (1)

19-21: Add missing semicolon for consistency

The new function is missing a semicolon at the end of the return statement, which is inconsistent with the coding style in the rest of the file.

function getResultMimeType(resultPath) {
-    return results[resultPath].mime_type
+    return results[resultPath].mime_type;
}
snakemake/report/html_reporter/template/components/toggle.js (3)

1-1: Remove redundant 'use strict' directive

The 'use strict' directive is redundant in ES modules as they operate in strict mode by default.

-'use strict';
🧰 Tools
🪛 Biome (1.9.4)

[error] 1-1: Redundant use strict directive.

The entire contents of JavaScript modules are automatically in strict mode, with no statement needed to initiate it.
Safe fix: Remove the redundant use strict directive.

(lint/suspicious/noRedundantUseStrict)


13-13: Remove console.log statement

It's best practice to remove debugging console.log statements before merging to production code.

        // swap the value
-        console.log(selected);
        this.setState({ value: selected });

78-110: Consider removing commented-out code

The commented-out render2 method adds unnecessary noise to the codebase. If this alternative rendering approach isn't needed, consider removing it entirely. If it might be needed in the future, add a TODO comment explaining why it's kept.

snakemake/caching/__init__.py (2)

8-18: Consider removing the unused logger import.

Static analysis flags snakemake.logging.logger as unused. If there's no plan to use it, removing it helps keep dependencies clean.

🧰 Tools
🪛 Ruff (0.8.2)

18-18: snakemake.logging.logger imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


30-30: Chain the original KeyError for clearer debugging.

When re-raising from the KeyError, consider using raise MissingOutputFileCachePathException() from e to preserve traceback context.

🧰 Tools
🪛 Ruff (0.8.2)

30-30: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

tests/test_output_index.py (1)

1-158: Excellent test coverage and clear parameterization.

All tests look functional and robust. You could optionally refactor repeated mocks into a helper fixture for cleaner code and reusability.

tests/test_validate/Snakefile (1)

4-73: Extensive data validation additions look correct.

You've effectively tested with both Pandas and Polars, including LazyFrames. Consider adding a negative test that intentionally violates the schema to confirm proper error handling.

snakemake/workflow.py (3)

127-127: Remove unused import.

The snakemake.api import is not referenced within this file, causing a lint warning. Consider removing it to reduce clutter.

Apply this diff to remove the unused import:

-from snakemake import api, caching, sourcecache
+from snakemake import caching, sourcecache
🧰 Tools
🪛 Ruff (0.8.2)

127-127: snakemake.api imported but unused

Remove unused import: snakemake.api

(F401)


256-259: Address the TODO or open a dedicated issue.

There's a TODO comment regarding potential side effects on the source cache. If this needs further investigation, consider opening an issue or providing an explanatory comment for maintainers.

Would you like me to open a new issue or draft a possible fix?


1813-1813: Avoid inequality comparison to False.

To improve readability and align with Pythonic style, consider checking truthiness rather than comparing to False.

Apply the following diff to simplify the condition:

- if not ruleinfo.template_engine and ruleinfo.container_img != False:
+ if not ruleinfo.template_engine and ruleinfo.container_img:
🧰 Tools
🪛 Ruff (0.8.2)

1813-1813: Avoid inequality comparisons to False; use if ruleinfo.container_img: for truth checks

Replace with ruleinfo.container_img

(E712)

snakemake/report/html_reporter/template/components/abstract_view_manager.js (1)

1-35: Implement or track the “Unimplemented!” handlers.

The handlers currently throw errors, so they are clearly placeholders. If this is intentional, consider creating tickets or stubs to ensure these methods are fully implemented in a subsequent update.

Would you like me to create an issue for each handler or propose a minimal implementation?

.github/workflows/main.yml (3)

46-54: Keep 'exclude' from being empty

Actionlint warns about an empty exclude block. If it is not needed, consider removing or commenting it out.

-        exclude: []
+        # exclude: []
🧰 Tools
🪛 actionlint (1.7.4)

52-52: "exclude" section should not be empty

(syntax-check)

🪛 YAMLlint (1.35.1)

[error] 53-53: trailing spaces

(trailing-spaces)


102-115: Store credentials securely

MinIO credentials are in plain text. Consider using GitHub secrets or environment variables to avoid exposing them.

-        with:
-          access_key: minio
-          secret_key: minio123
+        with:
+          access_key: ${{ secrets.MINIO_ACCESS_KEY }}
+          secret_key: ${{ secrets.MINIO_SECRET_KEY }}

211-218: Add a newline at end of file

YAMLlint flags a missing newline at the end of the file (line 218). Please add one to comply with best practices.

+      - run: echo "All tests passed."

+(Add an empty line here)
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 218-218: no new line character at the end of file

(new-line-at-end-of-file)

snakemake/dag.py (2)

419-424: Consider clarifying multi-line condition

Although valid, you might extract each condition into separate variables for better readability.

-        return (
-            f.is_storage
-            and f
-            not in self.workflow.storage_settings.unneeded_temp_files
-            and await f.exists_local()
-        )
+        is_storage = f.is_storage
+        is_needed_temp = f not in self.workflow.storage_settings.unneeded_temp_files
+        is_local = await f.exists_local()
+        return is_storage and is_needed_temp and is_local

427-431: Conditional storing logic

Storing outputs only after the job is finished is sound. Consider separate error-handling paths for logs vs. main output if an exception occurs.

snakemake/report/html_reporter/template/components/abstract_results.js (5)

1-3: Potential delimiter collisions

If array elements contain commas, joining them with “,” could cause collisions in the resulting key. Consider a safer delimiter or an escape mechanism.


41-48: Stateful initialization

Storing toggles and data in the component’s state is appropriate. Watch out for large data sets.


67-73: Initialize toggles carefully

value[0] might be undefined if ‘values’ is empty. Confirm there's always at least one default entry.


109-246: Result table generation

Consider adding accessibility enhancements (captions or ARIA attributes) for better usability.


272-320: Row rendering logic

Constructing rows with label values plus action buttons is straightforward. Using a stable key in React for each row is recommended.

snakemake/report/html_reporter/template/components/result_view_button.js (1)

4-70: Consider consolidating repeated logic in the handler methods.

The methods handleImg, handleHtml, and handlePdf share a similar structure, each defining button properties and calling app.setView(...) with only minor differences. You might consider extracting common logic into a helper function that accepts the content type, thus reducing duplication.

snakemake/io.py (4)

686-688: Attach original exception context when re-raising.

Raising:

raise WorkflowError(
    f"Failed to create output directory {dirpath}.", e
)

can lose traceback context. Consider using raise WorkflowError(...) from e to preserve the original exception details, as suggested by static analysis (B904).

🧰 Tools
🪛 Ruff (0.8.2)

686-688: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


693-697: Preserve the original FIFO creation error context.

Similarly, for the FIFO creation error, switch to:

raise WorkflowError(
    f"Failed to create FIFO for pipe output {self._file}.", e
) from e

to retain the underlying exception’s traceback.

🧰 Tools
🪛 Ruff (0.8.2)

695-697: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


945-966: Document new parameter for local consideration.

The _CONSIDER_LOCAL_DEFAULT and the new consider_local parameter in wait_for_files are introduced to selectively skip checking storage for certain files. This is a useful extension, but consider adding a docstring or comment clarifying how and when the user should populate consider_local.


1780-1780: Clarify async usage in property.

return async_run(sizes()) within a property might cause confusion since it triggers an async routine in a synchronous context. If this becomes more complex, consider refactoring to a standard async method or clarifying its design so that future maintenance is straightforward.

tests/tests_using_conda.py (1)

61-65: Review or remove the commented-out test portion.

The code around lines 61–65 references a known PermissionError issue with Conda on Windows, commented out for now. If this workaround is no longer needed, consider removing it or marking it clearly with a rationale for future reference.

🛑 Comments failed to post (3)
snakemake/utils.py (1)

118-149: ⚠️ Potential issue

Consider raising from original exception for better traceback
When catching jsonschema.exceptions.ValidationError at lines 131–132 and raising a new WorkflowError, use the raise ... from e form to preserve the original traceback, e.g.:

-except jsonschema.exceptions.ValidationError as e:
-    raise WorkflowError(f"Error validating row {i} of data frame.", e)
+except jsonschema.exceptions.ValidationError as e:
+    raise WorkflowError(f"Error validating row {i} of data frame.") from e
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    def _validate_pandas(data):
        try:
            import pandas as pd

            if isinstance(data, pd.DataFrame):
                logger.debug("Validating pandas DataFrame")

                recordlist = []
                for i, record in enumerate(data.to_dict("records")):
                    # Exclude NULL values
                    record = {k: v for k, v in record.items() if pd.notnull(v)}
                    try:
                        recordlist.append(_validate_record(record))
                    except jsonschema.exceptions.ValidationError as e:
                        raise WorkflowError(f"Error validating row {i} of data frame.") from e

                if set_default:
                    newdata = pd.DataFrame(recordlist, data.index)
                    # Add missing columns
                    newcol = newdata.columns[~newdata.columns.isin(data.columns)]
                    data[newcol] = None
                    # Fill in None values with values from newdata
                    data.update(newdata)

            else:
                return False
        except ImportError:
            return False
        return True
🧰 Tools
🪛 Ruff (0.8.2)

132-134: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

snakemake/report/html_reporter/template/components/toggle.js (1)

42-42: ⚠️ Potential issue

Use 'htmlFor' instead of 'for' in React

In React JSX, the HTML attribute 'for' should be written as 'htmlFor'.

-                    {for: valueId(0), className: `${labelClasses} rounded-l`},
+                    {htmlFor: valueId(0), className: `${labelClasses} rounded-l`},
-                    {for: valueId(1), className: `${labelClasses} rounded-r`},
+                    {htmlFor: valueId(1), className: `${labelClasses} rounded-r`},

Also applies to: 59-59

snakemake/jobs.py (1)

1585-1598: 🛠️ Refactor suggestion

Improve exception chaining.

When re-raising an exception in the except ExceptionGroup block, consider Python's recommended practice of using raise ... from e to preserve traceback context.

     except ExceptionGroup as e:
-        raise WorkflowError(
+        raise WorkflowError(
             f"Error postprocessing group job {self.jobid}.",
             *e.exceptions,
+        ) from e
-        )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

                async with asyncio.TaskGroup() as tg:
                    for job in level:
                        # postprocessing involves touching output files (to ensure that
                        # modification times are always correct. This has to happen in
                        # topological order, such that they are not mixed up.
                        # We have to disable output mtime checks here
                        # (at least for now), because they interfere with the update of
                        # intermediate file mtimes that might happen in previous levels.
                        tg.create_task(
                            job.postprocess(
                                error=error, check_output_mtime=False, **kwargs
                            )
                        )
            except ExceptionGroup as e:
                raise WorkflowError(
                    f"Error postprocessing group job {self.jobid}.",
                    *e.exceptions,
                ) from e

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/test_github_issue_3265_respect_dryrun_delete_all/Snakefile (1)

7-19: Consider using rm -f nothere.

If “nothere” is not guaranteed to exist at run-time, rm nothere may fail unexpectedly. Otherwise, this rule successfully tests that --delete-all-output in dry-run mode preserves files.

-        rm nothere && \
+        rm -f nothere && \
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3611073 and 5c61911.

📒 Files selected for processing (4)
  • tests/tests.py (7 hunks)
  • tests/test_github_issue_3265_respect_dryrun_delete_all/Snakefile (1 hunks)
  • tests/test_github_issue_3265_respect_dryrun_delete_all/Snakefile_inner (1 hunks)
  • tests/tests.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • tests/tests.py
🪛 Ruff (0.8.2)
tests/tests.py

18-18: snakemake.utils.min_version imported but unused

Remove unused import: snakemake.utils.min_version

(F401)

🔇 Additional comments (22)
tests/tests.py (9)

18-18: Ignore the static analysis hint about removing min_version.

Even though Ruff flags it as unused, this import is utilized at lines 1483 and 1493 for patching in test functions. It should remain in the code.

🧰 Tools
🪛 Ruff (0.8.2)

18-18: snakemake.utils.min_version imported but unused

Remove unused import: snakemake.utils.min_version

(F401)


20-20: Looks good!

This import is properly utilized for batch processing tests.


25-25: Imports are valid and in use.

All imported items (run, dpath, apptainer, connected) are referenced appropriately throughout the file.


263-265: New test function is fine.

A succinct test for test_report_href. Everything appears in line with the existing testing conventions.


613-616: No concerns with this new test setup.

test_default_storage extends coverage for S3 usage, skipping Windows as expected. Looks good.


641-642: LGTM.

test_storage continues the same S3-based storage testing strategy while skipping Windows. No issues noted.


1483-1485: Effective patch usage.

Patching min_version to avoid environment-based failures is a practical approach. Implementation looks correct.


1493-1496: Consistent min_version patching approach.

Same patch strategy as above, ensuring stable execution in transient environments. No issues found.


2016-2016: New test function aligns well with existing patterns.

test_shell_exec_singularity is consistent with other singularity tests. Declaration and skip logic are good.

tests/test_github_issue_3265_respect_dryrun_delete_all/Snakefile (3)

1-2: Shell executable declaration is clear.

Defaulting to bash aligns with Snakemake’s typical usage for shell commands. No issues here.


3-6: Top-level aggregation rule is straightforward.

Collecting outputs into the “all_ok” target is a standard pattern. Implementation is fine.


20-29: No concerns about the delete_temp_output rule.

This rule parallels the previous one, verifying that temp files remain intact under a dry-run. Everything looks good.

tests/test_github_issue_3265_respect_dryrun_delete_all/Snakefile_inner (10)

1-2: Shell executable set to bash.

Matches the overall style of the workflow for consistent shell usage.


3-8: Aggregating multiple outputs into all.

A straightforward method for final targets. Implementation is clear and functional.


9-13: Rule a implements symlinks and placeholders effectively.

Creation of a dangling symlink and touch on a protected file is well-structured. Note that touch nothere ensures “nothere” exists for subsequent usage in this workflow.


14-17: Rule b is direct and minimal.

Consumes “intermediate” and produces a final file. No issues found.


18-21: Rule c for empty_dir creation.

Straightforward mkdir command. Matches Snakemake’s directory() usage.


22-25: Rule d sets up a “full_dir” with a file.

Implementation is coherent and aligns with standard directory creation patterns.


26-30: Rule e exemplifies temp usage.

Produces a temporary file that can be cleaned by Snakemake. Code is clear and correct.


31-35: Rule f creates a temp directory.

Directory marking with temp() is well done for ephemeral data. No issues.


36-40: Rule g extends the temp directory pattern.

Identical approach to f, but populates a file inside. Works as intended.


41-43: Final temp rule merges temp directories.

Combines them into temp_keep. Straightforward usage of touch output. No concerns.

@bayraktar1
Copy link
Copy Markdown
Member

In addition, we fixed a typo in the 'some_dir' directory in the 'test_delete_all_output' test.

The mistake went unnoticed because the test checks if the directory is deleted.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/test_github_issue_3265_respect_dryrun_delete_all/Snakefile_inner (1)

70-76: Missing newline at end of file.

The file is missing a newline character at the end, which is a minor style issue but should be fixed to adhere to common coding standards.

    output:
        touch("temp_keep"),
+
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 11cbe89 and 7d5fac0.

📒 Files selected for processing (3)
  • tests/test_github_issue_3265_respect_dryrun_delete_all/Snakefile (1 hunks)
  • tests/test_github_issue_3265_respect_dryrun_delete_all/Snakefile_inner (1 hunks)
  • tests/tests.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/tests.py
  • tests/test_github_issue_3265_respect_dryrun_delete_all/Snakefile
🔇 Additional comments (8)
tests/test_github_issue_3265_respect_dryrun_delete_all/Snakefile_inner (8)

1-1: Good practice setting explicit shell executable.

Setting the shell executable to bash ensures consistent behavior across different environments.


4-9: LGTM - well-structured rule all.

The rule properly defines all the expected final outputs of the workflow, which will help test if these outputs are correctly handled when --delete-all-output respects --dry-run.


11-19: Comprehensive rule testing various output types.

This rule effectively tests multiple output types (regular file, symlink, protected file) which is good for validating the --delete-all-output behavior with --dry-run.


29-33: LGTM - clear directory creation test.

This rule correctly tests creation of an empty directory output, which is important for validating the --delete-all-output with --dry-run functionality.


36-40: LGTM - correctly tests directory with content.

This rule creates a directory containing a file, which is good for testing how --delete-all-output handles non-empty directories when --dry-run is specified.


43-49: LGTM - correctly tests temporary files.

The rule properly uses the temp() directive which marks the output for automatic deletion by Snakemake when no longer needed, providing good coverage for testing temporary file behavior.


52-58: LGTM - properly tests temporary empty directories.

The rule correctly combines directory() and temp() directives to test temporary directory handling.


61-67: LGTM - properly tests temporary directories with content.

This rule effectively tests how temporary directories with content are handled, which is important for comprehensive testing of the --delete-all-output functionality.

@bayraktar1
Copy link
Copy Markdown
Member

I was unable to determine the cause of test failure on Windows after a significant amount of effort. The test passes on OSX and Linux systems.

@sonarqubecloud
Copy link
Copy Markdown

@bayraktar1 bayraktar1 force-pushed the delete_all_output_dryrun branch from fde47d7 to 8e51364 Compare March 13, 2025 10:27
@johanneskoester johanneskoester merged commit 23fef82 into snakemake:main Mar 13, 2025
49 checks passed
@github-project-automation github-project-automation bot moved this from In progress to Done in Snakemake Hackathon March 2025 Mar 13, 2025
@coderabbitai coderabbitai bot mentioned this pull request Mar 13, 2025
2 tasks
johanneskoester pushed a commit that referenced this pull request Mar 14, 2025
🤖 I have created a release *beep* *boop*
---


##
[9.0.0](v8.30.0...v9.0.0)
(2025-03-14)


### ⚠ BREAKING CHANGES

* Logging refactor & add LoggerPluginInterface
([#3107](#3107))

### Features

* [#3412](#3412) - keep
shadow folder of failed job if --keep-incomplete flag is set.
([#3430](#3430))
([22978c3](22978c3))
* add flag --report-after-run to automatically generate the report after
a successfull workflow run
([#3428](#3428))
([b0a7f03](b0a7f03))
* add flatten function to IO utils
([#3424](#3424))
([67fa392](67fa392))
* add helper functions to parse input files
([#2918](#2918))
([63e45a7](63e45a7))
* Add option to print redacted file names
([#3089](#3089))
([ba4d264](ba4d264))
* add support for validation of polars dataframe and lazyframe
([#3262](#3262))
([c7473a6](c7473a6))
* added support for rendering dag with mermaid js
([#3409](#3409))
([7bf8381](7bf8381))
* adding --replace-workflow-config to fully replace workflow configs
(from config: directive) with --configfile, instead of merging them
([#3381](#3381))
([47504a0](47504a0))
* Dynamic module name
([#3401](#3401))
([024dc32](024dc32))
* Enable saving and reloading IOCache object
([#3386](#3386))
([c935953](c935953))
* files added in rule params with workflow.source_path will be available
in used containers
([#3385](#3385))
([a6e45bf](a6e45bf))
* Fix keep_local in storage directive and more freedom over remote
retrieval behaviour
([#3410](#3410))
([67b4739](67b4739))
* inherit parameters of use rule and extend/replace individual items
them when using 'with' directive
([#3365](#3365))
([93e4b92](93e4b92))
* Logging refactor & add LoggerPluginInterface
([#3107](#3107))
([86f1d6e](86f1d6e))
* Maximal file size for checksums
([#3368](#3368))
([b039f8a](b039f8a))
* Modernize package configuration using Pixi
([#3369](#3369))
([77992d8](77992d8))
* multiext support for named input/output
([#3372](#3372))
([05e1378](05e1378))
* optionally auto-group jobs via temp files in case of remote execution
([#3378](#3378))
([cc9bba2](cc9bba2))


### Bug Fixes

* `--delete-all-output` ignores `--dry-run`
([#3265](#3265))
([23fef82](23fef82))
* 3342 faster touch runs and warning messages for non-existing files
([#3398](#3398))
([cd9c3c3](cd9c3c3))
* add default value to max-jobs-per-timespan
([#3043](#3043))
([2959abe](2959abe))
* checkpoints inside modules are overwritten
([#3359](#3359))
([fba3ac7](fba3ac7))
* Convert Path to IOFile
([#3405](#3405))
([c58684c](c58684c))
* Do not perform storage object cleanup with --keep-storage-local-copies
set ([#3358](#3358))
([9a6d14b](9a6d14b))
* edgecases of source deployment in case of remote execution
([#3396](#3396))
([5da13be](5da13be))
* enhance error message formatting for strict DAG-building mode
([#3376](#3376))
([a1c39ee](a1c39ee))
* fix bug in checkpoint handling that led to exceptions in case
checkpoint output was missing upon rerun
([#3423](#3423))
([8cf4a2f](8cf4a2f))
* force check all required outputs
([#3341](#3341))
([495a4e7](495a4e7))
* group job formatting
([#3442](#3442))
([f0b10a3](f0b10a3))
* in remote jobs, upload storage in topological order such that
modification dates are preserved (e.g. in case of group jobs)
([#3377](#3377))
([eace08f](eace08f))
* only skip eval when resource depends on input
([#3374](#3374))
([4574c92](4574c92))
* Prevent execution of conda in apptainer when not explicitly requested
in software deployment method
([#3388](#3388))
([c43c5c0](c43c5c0))
* print filenames with quotes around them in RuleException
([#3269](#3269))
([6baeda5](6baeda5))
* Re-evaluation of free resources
([#3399](#3399))
([6371293](6371293))
* ReadTheDocs layout issue due to src directory change
([#3419](#3419))
([695b127](695b127))
* robustly escaping quotes in generated bash scripts (v2)
([#3297](#3297))
([#3389](#3389))
([58720bd](58720bd))
* Show apptainer image URL in snakemake report
([#3407](#3407))
([45f0450](45f0450))
* Update ReadTheDocs configuration for documentation build to use Pixi
([#3433](#3433))
([3f227a6](3f227a6))


### Documentation

* Add pixi setup instructions to general use tutorial
([#3382](#3382))
([115e81b](115e81b))
* fix contribution section heading levels, fix docs testing setup order
([#3360](#3360))
([051dc53](051dc53))
* fix link to github.com/snakemake/poetry-snakemake-plugin
([#3436](#3436))
([ec6d97c](ec6d97c))
* fix quoting
([#3394](#3394))
([b40f599](b40f599))
* fix rerun-triggers default
([#3403](#3403))
([4430e23](4430e23))
* fix typo 'safe' -&gt; 'save'
([#3384](#3384))
([7755861](7755861))
* mention code formatting in the contribution section
([#3431](#3431))
([e8682b7](e8682b7))
* remove duplicated 'functions'.
([#3356](#3356))
([7c595db](7c595db))
* update broken links documentation
([#3437](#3437))
([e3d0d88](e3d0d88))
* Updating contributing guidelines with new pixi dev setup
([#3415](#3415))
([8e95a12](8e95a12))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: snakemake-bot <snakemake-bot-admin@googlegroups.com>
kjohnsen pushed a commit to kjohnsen/snakemake that referenced this pull request Dec 15, 2025
<!--Add a description of your PR here-->
Fixes snakemake#3264 

### QC
<!-- Make sure that you can tick the boxes below. -->

* [ ] The PR contains a test case for the changes or the changes are
already covered by an existing test case.
* [x] The documentation (`docs/`) is updated to reflect the changes or
this is not necessary (e.g. if the change does neither modify the
language nor the behavior or functionalities of Snakemake).

Should I add a test case for this?

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added a dry run option for output deletion, allowing users to simulate
deletions before execution.
- Enhanced workflow commands to improve file and directory management
during output operations.

- **Bug Fixes**
- Corrected a file path reference in deletion validation to ensure more
reliable output handling.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: bayraktar1 <bayraktar.d@protonmail.com>
kjohnsen pushed a commit to kjohnsen/snakemake that referenced this pull request Dec 15, 2025
🤖 I have created a release *beep* *boop*
---


##
[9.0.0](snakemake/snakemake@v8.30.0...v9.0.0)
(2025-03-14)


### ⚠ BREAKING CHANGES

* Logging refactor & add LoggerPluginInterface
([snakemake#3107](snakemake#3107))

### Features

* [snakemake#3412](snakemake#3412) - keep
shadow folder of failed job if --keep-incomplete flag is set.
([snakemake#3430](snakemake#3430))
([22978c3](snakemake@22978c3))
* add flag --report-after-run to automatically generate the report after
a successfull workflow run
([snakemake#3428](snakemake#3428))
([b0a7f03](snakemake@b0a7f03))
* add flatten function to IO utils
([snakemake#3424](snakemake#3424))
([67fa392](snakemake@67fa392))
* add helper functions to parse input files
([snakemake#2918](snakemake#2918))
([63e45a7](snakemake@63e45a7))
* Add option to print redacted file names
([snakemake#3089](snakemake#3089))
([ba4d264](snakemake@ba4d264))
* add support for validation of polars dataframe and lazyframe
([snakemake#3262](snakemake#3262))
([c7473a6](snakemake@c7473a6))
* added support for rendering dag with mermaid js
([snakemake#3409](snakemake#3409))
([7bf8381](snakemake@7bf8381))
* adding --replace-workflow-config to fully replace workflow configs
(from config: directive) with --configfile, instead of merging them
([snakemake#3381](snakemake#3381))
([47504a0](snakemake@47504a0))
* Dynamic module name
([snakemake#3401](snakemake#3401))
([024dc32](snakemake@024dc32))
* Enable saving and reloading IOCache object
([snakemake#3386](snakemake#3386))
([c935953](snakemake@c935953))
* files added in rule params with workflow.source_path will be available
in used containers
([snakemake#3385](snakemake#3385))
([a6e45bf](snakemake@a6e45bf))
* Fix keep_local in storage directive and more freedom over remote
retrieval behaviour
([snakemake#3410](snakemake#3410))
([67b4739](snakemake@67b4739))
* inherit parameters of use rule and extend/replace individual items
them when using 'with' directive
([snakemake#3365](snakemake#3365))
([93e4b92](snakemake@93e4b92))
* Logging refactor & add LoggerPluginInterface
([snakemake#3107](snakemake#3107))
([86f1d6e](snakemake@86f1d6e))
* Maximal file size for checksums
([snakemake#3368](snakemake#3368))
([b039f8a](snakemake@b039f8a))
* Modernize package configuration using Pixi
([snakemake#3369](snakemake#3369))
([77992d8](snakemake@77992d8))
* multiext support for named input/output
([snakemake#3372](snakemake#3372))
([05e1378](snakemake@05e1378))
* optionally auto-group jobs via temp files in case of remote execution
([snakemake#3378](snakemake#3378))
([cc9bba2](snakemake@cc9bba2))


### Bug Fixes

* `--delete-all-output` ignores `--dry-run`
([snakemake#3265](snakemake#3265))
([23fef82](snakemake@23fef82))
* 3342 faster touch runs and warning messages for non-existing files
([snakemake#3398](snakemake#3398))
([cd9c3c3](snakemake@cd9c3c3))
* add default value to max-jobs-per-timespan
([snakemake#3043](snakemake#3043))
([2959abe](snakemake@2959abe))
* checkpoints inside modules are overwritten
([snakemake#3359](snakemake#3359))
([fba3ac7](snakemake@fba3ac7))
* Convert Path to IOFile
([snakemake#3405](snakemake#3405))
([c58684c](snakemake@c58684c))
* Do not perform storage object cleanup with --keep-storage-local-copies
set ([snakemake#3358](snakemake#3358))
([9a6d14b](snakemake@9a6d14b))
* edgecases of source deployment in case of remote execution
([snakemake#3396](snakemake#3396))
([5da13be](snakemake@5da13be))
* enhance error message formatting for strict DAG-building mode
([snakemake#3376](snakemake#3376))
([a1c39ee](snakemake@a1c39ee))
* fix bug in checkpoint handling that led to exceptions in case
checkpoint output was missing upon rerun
([snakemake#3423](snakemake#3423))
([8cf4a2f](snakemake@8cf4a2f))
* force check all required outputs
([snakemake#3341](snakemake#3341))
([495a4e7](snakemake@495a4e7))
* group job formatting
([snakemake#3442](snakemake#3442))
([f0b10a3](snakemake@f0b10a3))
* in remote jobs, upload storage in topological order such that
modification dates are preserved (e.g. in case of group jobs)
([snakemake#3377](snakemake#3377))
([eace08f](snakemake@eace08f))
* only skip eval when resource depends on input
([snakemake#3374](snakemake#3374))
([4574c92](snakemake@4574c92))
* Prevent execution of conda in apptainer when not explicitly requested
in software deployment method
([snakemake#3388](snakemake#3388))
([c43c5c0](snakemake@c43c5c0))
* print filenames with quotes around them in RuleException
([snakemake#3269](snakemake#3269))
([6baeda5](snakemake@6baeda5))
* Re-evaluation of free resources
([snakemake#3399](snakemake#3399))
([6371293](snakemake@6371293))
* ReadTheDocs layout issue due to src directory change
([snakemake#3419](snakemake#3419))
([695b127](snakemake@695b127))
* robustly escaping quotes in generated bash scripts (v2)
([snakemake#3297](snakemake#3297))
([snakemake#3389](snakemake#3389))
([58720bd](snakemake@58720bd))
* Show apptainer image URL in snakemake report
([snakemake#3407](snakemake#3407))
([45f0450](snakemake@45f0450))
* Update ReadTheDocs configuration for documentation build to use Pixi
([snakemake#3433](snakemake#3433))
([3f227a6](snakemake@3f227a6))


### Documentation

* Add pixi setup instructions to general use tutorial
([snakemake#3382](snakemake#3382))
([115e81b](snakemake@115e81b))
* fix contribution section heading levels, fix docs testing setup order
([snakemake#3360](snakemake#3360))
([051dc53](snakemake@051dc53))
* fix link to github.com/snakemake/poetry-snakemake-plugin
([snakemake#3436](snakemake#3436))
([ec6d97c](snakemake@ec6d97c))
* fix quoting
([snakemake#3394](snakemake#3394))
([b40f599](snakemake@b40f599))
* fix rerun-triggers default
([snakemake#3403](snakemake#3403))
([4430e23](snakemake@4430e23))
* fix typo 'safe' -&gt; 'save'
([snakemake#3384](snakemake#3384))
([7755861](snakemake@7755861))
* mention code formatting in the contribution section
([snakemake#3431](snakemake#3431))
([e8682b7](snakemake@e8682b7))
* remove duplicated 'functions'.
([snakemake#3356](snakemake#3356))
([7c595db](snakemake@7c595db))
* update broken links documentation
([snakemake#3437](snakemake#3437))
([e3d0d88](snakemake@e3d0d88))
* Updating contributing guidelines with new pixi dev setup
([snakemake#3415](snakemake#3415))
([8e95a12](snakemake@8e95a12))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: snakemake-bot <snakemake-bot-admin@googlegroups.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

Development

Successfully merging this pull request may close these issues.

--delete-all-output ignores --dry-run

3 participants