Skip to content

feat: Add option to print redacted file names#3089

Merged
johanneskoester merged 12 commits intosnakemake:mainfrom
MattMonk:main
Mar 14, 2025
Merged

feat: Add option to print redacted file names#3089
johanneskoester merged 12 commits intosnakemake:mainfrom
MattMonk:main

Conversation

@MattMonk
Copy link
Copy Markdown
Contributor

@MattMonk MattMonk commented Sep 17, 2024

Addresses issue #3087.

Adds some functionality so that if a file is a storage object, it can print a censored version of the query (print_query) rather than just the query itself incase it contains sensitive information in the query that should not be printed.

I suppose it would make sense to at least put something in the documentation (in snakemake-interface-storage-plugins maybe?) to explain that the safe_print method can be used (as an aside, I tested this with the _safe_to_print_url method of snakemake-storage-plugin-xrootd but figured that method name is probably not general enough so I can rename it for that plugin).

Note this also slightly changes the message when printing an input file as previously it would always print (retrieve from storage) even if retrieve=False.

QC

  • The PR contains a test case for the changes or the changes are already covered by an existing test case.
  • The documentation (docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

Summary by CodeRabbit

  • New Features
    • Enabled configurable integration for xrootd-based storage, enhancing file handling workflows.
  • Refactor
    • Improved messages and output clarity for storage operations, offering more precise file processing instructions.
  • Tests
    • Added validations to confirm that sensitive credentials remain concealed.
  • Chores
    • Updated continuous integration triggers to restrict execution based on repository name.
    • Added a new dependency for enhanced xrootd storage capabilities.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Sep 17, 2024

📝 Walkthrough

Walkthrough

The pull request introduces a new storage provider configuration using "xrootd" in the Snakefile with corresponding parameters and a rule. A test function was added to verify that sensitive strings do not appear in the output. Additionally, the GitHub Actions workflow now includes a conditional check that restricts job execution based on the repository name. The dependency file is updated with a new package. Finally, several functions across the codebase have been modified to use print_query instead of query, with jobs.py further refining its logic for formatting file messages.

Changes

File(s) Change Summary
tests/test_censored_path/Snakefile Added new storage provider configuration (provider, username, password, url_decorator) and a new rule using storage.xrootd("root://somehost//path/to/file.root") with an associated shell command ("echo").
tests/tests.py Introduced a new test function test_censored_path that runs Snakemake on the Snakefile and asserts that sensitive strings are not present in the output. Additionally, modified test_keep_local to include more checks.
.github/workflows/docker-publish.yml Added a conditional statement (if: github.repository == 'snakemake/snakemake') in the update job to restrict its execution based on the repository name.
test-environment.yml Added the dependency snakemake-storage-plugin-xrootd >=0.2.0 under the pip section.
src/snakemake/dag.py
src/snakemake/io.py
Updated functions to replace usage of storage_object.query with storage_object.print_query in the DAG's summary method and the logging within IO methods.
src/snakemake/jobs.py Modified the format_file function to use a conditional structure for input files (choosing "retrieve from" or "keep remote on" based on storage_object.retrieve) and updated it to use print_query instead of query.

Sequence Diagram(s)

sequenceDiagram
    participant GitHub as GitHub
    participant Runner as Workflow Runner
    participant Job as Update Job

    GitHub->>Runner: Trigger workflow on event
    Runner->>Runner: Evaluate "github.repository == 'snakemake/snakemake'"
    alt Condition true
        Runner->>Job: Execute update job
    else Condition false
        Runner->>Job: Skip update job
    end
Loading
sequenceDiagram
    participant Caller as Caller
    participant FF as format_file()
    participant SO as Storage Object

    Caller->>FF: Call format_file(file f)
    alt f is input
      alt storage.retrieve true
          FF-->SO: Call print_query()
          FF-->>Caller: Return "retrieve from" + query
      else storage.retrieve false
          FF-->SO: Call print_query()
          FF-->>Caller: Return "keep remote on" + query
      end
    else f is output
        FF-->SO: Call print_query()
        FF-->>Caller: Return "send to" + query
    end
Loading

Possibly related PRs

Suggested reviewers

  • johanneskoester

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bba9c19 and 98df123.

📒 Files selected for processing (1)
  • tests/tests.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/tests.py
⏰ Context from checks skipped due to timeout of 90000ms (16)
  • GitHub Check: tests (10, windows-latest, py312)
  • GitHub Check: tests (10, windows-latest, py311)
  • GitHub Check: tests (10, ubuntu-latest, py312)
  • GitHub Check: tests (10, ubuntu-latest, py311)
  • GitHub Check: tests (3, windows-latest, py311)
  • GitHub Check: tests (3, ubuntu-latest, py312)
  • GitHub Check: tests (3, ubuntu-latest, py311)
  • GitHub Check: tests (2, windows-latest, py312)
  • GitHub Check: tests (2, windows-latest, py311)
  • GitHub Check: tests (2, ubuntu-latest, py312)
  • GitHub Check: tests (2, ubuntu-latest, py311)
  • GitHub Check: tests (1, windows-latest, py312)
  • GitHub Check: tests (1, windows-latest, py311)
  • GitHub Check: tests (1, ubuntu-latest, py312)
  • GitHub Check: tests (1, ubuntu-latest, py311)
  • GitHub Check: apidocs

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@MattMonk
Copy link
Copy Markdown
Contributor Author

I've added a test and this also now relies on PR snakemake/snakemake-interface-storage-plugins#54 to happen first.

@MattMonk MattMonk marked this pull request as ready for review September 18, 2024 08:57
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Outside diff range and nitpick comments (1)
snakemake/jobs.py (1)

78-81: Consider using a ternary operator to simplify the code.

As per the static analysis hint, you can replace the if-else block with a ternary operator to make the code more concise:

-        if is_input:
-            if f.storage_object.retrieve:
-                phrase = "retrieve from"
-            else:
-                phrase = "keep remote on"
+        phrase = "retrieve from" if is_input and f.storage_object.retrieve else "keep remote on"

This change simplifies the logic without altering the functionality.

Tools
Ruff

78-81: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between e9ce4a3 and fe21227.

Files selected for processing (3)
  • snakemake/jobs.py (1 hunks)
  • tests/test_censored_path/Snakefile (1 hunks)
  • tests/tests.py (1 hunks)
Additional context used
Path-based instructions (2)
tests/tests.py (1)

Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

snakemake/jobs.py (1)

Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Ruff
snakemake/jobs.py

78-81: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

Additional comments not posted (3)
tests/tests.py (1)

2263-2275: Looks good, nice test!

The new test_censored_path function is a valuable addition to the test suite. It verifies that sensitive information like passwords and parameter names/values are properly censored in the Snakemake output. This helps prevent unintentional leakage of sensitive data.

The test logic looks solid:

  • It runs Snakemake with a specific Snakefile and captures the stdout.
  • It makes assertions to ensure the absence of sensitive strings in the captured output.
  • Using the -n dry-run flag avoids actual workflow execution.

Overall, this test should help catch and prevent bugs related to improper censoring of sensitive information. Great work!

snakemake/jobs.py (2)

77-85: Excellent work on improving the clarity of the output!

The changes introduce a more nuanced handling of the is_input parameter when dealing with storage objects. By differentiating between the scenarios when the storage object is set to retrieve data and when it is not, the output provides more specific information about the action being taken. This enhances the semantic clarity, allowing users to better understand the operations being performed.

Additionally, utilizing f.storage_object.print_query instead of f.storage_object.query further refines the information presented.

Overall, these changes significantly improve the readability and understandability of the function's output. Well done!

Tools
Ruff

78-81: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)


Line range hint 1-1: Function Reason.__str__ reviewed.

The Reason.__str__ function was reviewed as part of the comprehensive code analysis. However, no changes were made to this function in the provided code diff. The existing implementation appears to be functioning as intended, generating an appropriate string representation of the Reason object based on its attributes.

Tools
Ruff

78-81: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

@sonarqubecloud
Copy link
Copy Markdown

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
tests/test_censored_path/Snakefile (1)

1-5: ⚠️ Potential issue

Avoid hardcoding sensitive information in the Snakefile.

This code hardcodes credentials directly in the Snakefile, which is a security risk. Even though this is a test file, it sets a bad example for users.

Consider using environment variables or a separate configuration file for sensitive information:

-    username="my_username",
-    password="my_password",
+    username=os.environ.get("XROOTD_USERNAME", "my_username"),
+    password=os.environ.get("XROOTD_PASSWORD", "my_password"),
🧹 Nitpick comments (3)
tests/test_censored_path/Snakefile (2)

5-5: Add a comment explaining the purpose of the URL decorator.

The URL decorator appends query parameters to the URL, but it's not immediately clear why this is necessary or what the parameters are used for.

Add a descriptive comment explaining the purpose of the URL decorator:

+    # Append query parameters needed for authentication or special handling by the xrootd provider
     url_decorator=lambda x: x + "?param_name=param_value"

7-9: Use a variable for the output path and add more descriptive shell command.

The hardcoded output path makes the rule less flexible and the empty echo command doesn't illustrate the purpose of the rule.

Consider using a variable for the output path and implementing a more descriptive shell command:

 rule:
-    output: storage.xrootd("root://somehost//path/to/file.root")
-    shell: "echo"
+    output: 
+        out = storage.xrootd("root://somehost//path/to/file.root")
+    shell: 
+        "echo 'This rule demonstrates the censoring of sensitive information in storage paths' > /dev/null"
snakemake/jobs.py (1)

81-89: Good enhancement to storage object output labeling.

The changes improve the descriptiveness of storage object references in the output by differentiating between different types of storage operations based on the input/output context and the state of the storage object.

The nested conditional could be simplified using a ternary operator for better readability:

 if is_input:
-    if f.storage_object.retrieve:
-        phrase = "retrieve from"
-    else:
-        phrase = "keep remote on"
+    phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"
 else:
     phrase = "send to"
🧰 Tools
🪛 Ruff (0.8.2)

82-85: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fe21227 and 5dcf1eb.

📒 Files selected for processing (3)
  • snakemake/jobs.py (1 hunks)
  • tests/test_censored_path/Snakefile (1 hunks)
  • tests/tests.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • tests/tests.py
  • snakemake/jobs.py
🪛 Ruff (0.8.2)
snakemake/jobs.py

82-85: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

⏰ Context from checks skipped due to timeout of 90000ms (41)
  • GitHub Check: testing (10, 3.12, dash)
  • GitHub Check: testing (10, 3.12, bash)
  • GitHub Check: testing (10, 3.11, bash)
  • GitHub Check: testing (9, 3.12, dash)
  • GitHub Check: testing (9, 3.12, bash)
  • GitHub Check: testing (9, 3.11, bash)
  • GitHub Check: testing (8, 3.12, dash)
  • GitHub Check: testing (8, 3.12, bash)
  • GitHub Check: testing (8, 3.11, bash)
  • GitHub Check: testing (7, 3.12, dash)
  • GitHub Check: testing (7, 3.12, bash)
  • GitHub Check: testing (7, 3.11, bash)
  • GitHub Check: testing (6, 3.12, dash)
  • GitHub Check: testing (6, 3.12, bash)
  • GitHub Check: testing (6, 3.11, bash)
  • GitHub Check: testing (5, 3.12, dash)
  • GitHub Check: testing (5, 3.12, bash)
  • GitHub Check: testing (5, 3.11, bash)
  • GitHub Check: testing (4, 3.12, dash)
  • GitHub Check: testing (4, 3.12, bash)
  • GitHub Check: testing (4, 3.11, bash)
  • GitHub Check: testing-windows (10)
  • GitHub Check: testing (3, 3.12, dash)
  • GitHub Check: testing-windows (9)
  • GitHub Check: testing (3, 3.12, bash)
  • GitHub Check: testing-windows (8)
  • GitHub Check: testing (3, 3.11, bash)
  • GitHub Check: testing-windows (7)
  • GitHub Check: testing (2, 3.12, dash)
  • GitHub Check: testing-windows (6)
  • GitHub Check: testing (2, 3.12, bash)
  • GitHub Check: testing-windows (5)
  • GitHub Check: testing (2, 3.11, bash)
  • GitHub Check: testing-windows (4)
  • GitHub Check: testing (1, 3.12, dash)
  • GitHub Check: testing-windows (3)
  • GitHub Check: testing (1, 3.12, bash)
  • GitHub Check: testing-windows (2)
  • GitHub Check: testing (1, 3.11, bash)
  • GitHub Check: testing-windows (1)
  • GitHub Check: apidocs
🔇 Additional comments (2)
tests/tests.py (1)

2344-2357: Well-implemented test for path censoring functionality.

The test effectively verifies that sensitive information is not exposed in the Snakemake output when using storage objects. It checks for the absence of credentials and query parameters in the command output.

snakemake/jobs.py (1)

88-88: Important security enhancement: Using print_query instead of query.

This change is crucial for preventing sensitive information leakage. The print_query property likely implements the mechanism to censor or redact sensitive information such as credentials and query parameters from storage object URLs before they're displayed in logs or output.

@MattMonk
Copy link
Copy Markdown
Contributor Author

I think this requires snakemake/snakemake-interface-storage-plugins#54 to be merged first before the CI tests can run as snakemake/snakemake-interface-storage-plugins#54 adds the print_query method to the storage object

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
snakemake/jobs.py (1)

81-92: Improve security by implementing redacted query printing logic.

This change adds better handling of storage objects by:

  1. Using more explicit phrasing based on the input type and retrieval flag
  2. Adding support for safely printing storage queries by using the provider's safe_print method when available

The code correctly handles the case where the provider may not have implemented the safe_print method by falling back to the original query.

Consider using a ternary operator for the phrase selection to make the code more concise:

-        if is_input:
-            if f.storage_object.retrieve:
-                phrase = "retrieve from"
-            else:
-                phrase = "keep remote on"
-        else:
-            phrase = "send to"
+        phrase = "send to" if not is_input else ("retrieve from" if f.storage_object.retrieve else "keep remote on")
🧰 Tools
🪛 Ruff (0.8.2)

82-85: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7783b59 and b6443ca.

📒 Files selected for processing (6)
  • snakemake/jobs.py (1 hunks)
  • snakemake/jobs.py (1 hunks)
  • tests/test_censored_path/Snakefile (1 hunks)
  • tests/tests.py (1 hunks)
  • snakemake/dag.py (1 hunks)
  • snakemake/io.py (4 hunks)
✅ Files skipped from review due to trivial changes (1)
  • snakemake/io.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/tests.py
  • tests/test_censored_path/Snakefile
  • snakemake/jobs.py
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • snakemake/dag.py
  • snakemake/jobs.py
🪛 Ruff (0.8.2)
snakemake/jobs.py

82-85: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

🔇 Additional comments (1)
snakemake/dag.py (1)

2527-2527: Good security enhancement using print_query for storage objects.

This change improves security by using the print_query method instead of directly exposing the raw query, which might contain sensitive information like credentials or private paths.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
snakemake/jobs.py (1)

81-89: Good enhancement of storage-related messaging with safe printing

The implementation now properly distinguishes between different storage operations ("retrieve from", "keep remote on", "send to") based on file type and retrieval settings. The use of print_query instead of the previous query attribute helps ensure sensitive information in file paths is not leaked.

Consider using a ternary operator for slightly more concise code:

if is_input:
-    if f.storage_object.retrieve:
-        phrase = "retrieve from"
-    else:
-        phrase = "keep remote on"
+    phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"
else:
    phrase = "send to"
🧰 Tools
🪛 Ruff (0.8.2)

82-85: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 578f3be and f6cf30b.

📒 Files selected for processing (3)
  • snakemake/dag.py (1 hunks)
  • snakemake/jobs.py (1 hunks)
  • tests/tests.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/tests.py
  • snakemake/dag.py
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • snakemake/jobs.py
🪛 Ruff (0.8.2)
snakemake/jobs.py

82-85: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

⏰ Context from checks skipped due to timeout of 90000ms (29)
  • GitHub Check: testing (10, 3.12, bash)
  • GitHub Check: testing (10, 3.11, bash)
  • GitHub Check: testing (9, 3.11, bash)
  • GitHub Check: testing (8, 3.12, bash)
  • GitHub Check: testing (8, 3.11, bash)
  • GitHub Check: testing (7, 3.12, bash)
  • GitHub Check: testing (7, 3.11, bash)
  • GitHub Check: testing (6, 3.12, bash)
  • GitHub Check: testing-windows (10)
  • GitHub Check: testing (6, 3.11, bash)
  • GitHub Check: testing-windows (9)
  • GitHub Check: testing (5, 3.12, bash)
  • GitHub Check: testing (5, 3.11, bash)
  • GitHub Check: testing-windows (8)
  • GitHub Check: testing-windows (7)
  • GitHub Check: testing (4, 3.12, bash)
  • GitHub Check: testing-windows (6)
  • GitHub Check: testing (4, 3.11, bash)
  • GitHub Check: testing-windows (5)
  • GitHub Check: testing (3, 3.12, bash)
  • GitHub Check: testing-windows (4)
  • GitHub Check: testing (3, 3.11, bash)
  • GitHub Check: testing-windows (3)
  • GitHub Check: testing (2, 3.12, bash)
  • GitHub Check: testing-windows (2)
  • GitHub Check: testing (2, 3.11, bash)
  • GitHub Check: testing-windows (1)
  • GitHub Check: testing (1, 3.12, bash)
  • GitHub Check: testing (1, 3.11, bash)

Copy link
Copy Markdown
Contributor

@johanneskoester johanneskoester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
.github/workflows/docker-publish.yml (1)

10-10: Conditional Execution of Docker Publish Job

The new condition if: github.repository == 'snakemake/snakemake' effectively restricts the update job to run only when the repository matches the expected string. This is a useful safeguard that prevents accidental deployments from forks or different repositories.

Consider documenting this condition in a comment so that future maintainers understand its purpose at a glance.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9c1288b and eb6fadc.

📒 Files selected for processing (1)
  • .github/workflows/docker-publish.yml (1 hunks)

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
snakemake/jobs.py (1)

81-89: Good addition of detailed storage operation phrases

The implementation now provides more descriptive phrases for different storage operations, enhancing clarity by distinguishing between retrieving files and keeping them remote. The use of print_query instead of the direct query ensures sensitive information can be redacted.

You could simplify the conditional logic on lines 82-85 using a ternary operator:

-        if is_input:
-            if f.storage_object.retrieve:
-                phrase = "retrieve from"
-            else:
-                phrase = "keep remote on"
-        else:
-            phrase = "send to"
+        if is_input:
+            phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"
+        else:
+            phrase = "send to"
🧰 Tools
🪛 Ruff (0.8.2)

82-85: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bf848c1 and 3ef352a.

📒 Files selected for processing (8)
  • .github/workflows/docker-publish.yml (1 hunks)
  • setup.cfg (1 hunks)
  • snakemake/dag.py (1 hunks)
  • snakemake/io.py (4 hunks)
  • snakemake/jobs.py (1 hunks)
  • test-environment.yml (1 hunks)
  • tests/test_censored_path/Snakefile (1 hunks)
  • tests/tests.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
  • test-environment.yml
  • tests/test_censored_path/Snakefile
  • .github/workflows/docker-publish.yml
  • snakemake/io.py
  • setup.cfg
  • tests/tests.py
  • snakemake/dag.py
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • snakemake/jobs.py
🪛 Ruff (0.8.2)
snakemake/jobs.py

82-85: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

⏰ Context from checks skipped due to timeout of 90000ms (19)
  • GitHub Check: testing (10, 3.12, bash)
  • GitHub Check: testing (10, 3.11, bash)
  • GitHub Check: testing-windows (10)
  • GitHub Check: testing-windows (9)
  • GitHub Check: testing-windows (8)
  • GitHub Check: testing (4, 3.11, bash)
  • GitHub Check: testing-windows (7)
  • GitHub Check: testing-windows (6)
  • GitHub Check: testing (3, 3.11, bash)
  • GitHub Check: testing-windows (5)
  • GitHub Check: testing (2, 3.12, bash)
  • GitHub Check: testing-windows (4)
  • GitHub Check: testing (2, 3.11, bash)
  • GitHub Check: testing-windows (3)
  • GitHub Check: testing (1, 3.12, bash)
  • GitHub Check: testing-windows (2)
  • GitHub Check: testing (1, 3.11, bash)
  • GitHub Check: testing-windows (1)
  • GitHub Check: apidocs

@sonarqubecloud
Copy link
Copy Markdown

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/snakemake/jobs.py (1)

82-90: Improved file storage message clarity and security

The code now distinguishes between different storage operations by using different phrases based on whether an input file is being retrieved or kept remote. It also uses print_query instead of the original query to protect sensitive information, which aligns with the PR objective.

You could make the code more concise by using a ternary operator:

-        if is_input:
-            if f.storage_object.retrieve:
-                phrase = "retrieve from"
-            else:
-                phrase = "keep remote on"
-        else:
-            phrase = "send to"
+        phrase = "send to" if not is_input else ("retrieve from" if f.storage_object.retrieve else "keep remote on")
🧰 Tools
🪛 Ruff (0.8.2)

83-86: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5cde6c6 and fef8057.

📒 Files selected for processing (7)
  • .github/workflows/docker-publish.yml (1 hunks)
  • src/snakemake/dag.py (1 hunks)
  • src/snakemake/io.py (4 hunks)
  • src/snakemake/jobs.py (1 hunks)
  • test-environment.yml (1 hunks)
  • tests/test_censored_path/Snakefile (1 hunks)
  • tests/tests.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • src/snakemake/io.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • .github/workflows/docker-publish.yml
  • tests/test_censored_path/Snakefile
  • test-environment.yml
  • tests/tests.py
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • src/snakemake/dag.py
  • src/snakemake/jobs.py
🪛 Ruff (0.8.2)
src/snakemake/jobs.py

83-86: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

⏰ Context from checks skipped due to timeout of 90000ms (40)
  • GitHub Check: tests (10, windows-latest, py312, bash)
  • GitHub Check: tests (10, windows-latest, py311, bash)
  • GitHub Check: tests (10, ubuntu-latest, py312, bash)
  • GitHub Check: tests (10, ubuntu-latest, py311, bash)
  • GitHub Check: tests (9, windows-latest, py312, bash)
  • GitHub Check: tests (9, windows-latest, py311, bash)
  • GitHub Check: tests (9, ubuntu-latest, py312, bash)
  • GitHub Check: tests (9, ubuntu-latest, py311, bash)
  • GitHub Check: tests (8, windows-latest, py312, bash)
  • GitHub Check: tests (8, windows-latest, py311, bash)
  • GitHub Check: tests (8, ubuntu-latest, py312, bash)
  • GitHub Check: tests (8, ubuntu-latest, py311, bash)
  • GitHub Check: tests (7, windows-latest, py312, bash)
  • GitHub Check: tests (7, windows-latest, py311, bash)
  • GitHub Check: tests (7, ubuntu-latest, py312, bash)
  • GitHub Check: tests (7, ubuntu-latest, py311, bash)
  • GitHub Check: tests (6, windows-latest, py312, bash)
  • GitHub Check: tests (6, windows-latest, py311, bash)
  • GitHub Check: tests (6, ubuntu-latest, py312, bash)
  • GitHub Check: tests (6, ubuntu-latest, py311, bash)
  • GitHub Check: tests (5, windows-latest, py312, bash)
  • GitHub Check: tests (5, windows-latest, py311, bash)
  • GitHub Check: tests (5, ubuntu-latest, py312, bash)
  • GitHub Check: tests (5, ubuntu-latest, py311, bash)
  • GitHub Check: tests (4, windows-latest, py312, bash)
  • GitHub Check: tests (4, windows-latest, py311, bash)
  • GitHub Check: tests (4, ubuntu-latest, py312, bash)
  • GitHub Check: tests (4, ubuntu-latest, py311, bash)
  • GitHub Check: tests (3, windows-latest, py312, bash)
  • GitHub Check: tests (3, windows-latest, py311, bash)
  • GitHub Check: tests (3, ubuntu-latest, py312, bash)
  • GitHub Check: tests (3, ubuntu-latest, py311, bash)
  • GitHub Check: tests (2, windows-latest, py312, bash)
  • GitHub Check: tests (2, windows-latest, py311, bash)
  • GitHub Check: tests (2, ubuntu-latest, py312, bash)
  • GitHub Check: tests (2, ubuntu-latest, py311, bash)
  • GitHub Check: tests (1, windows-latest, py312, bash)
  • GitHub Check: tests (1, windows-latest, py311, bash)
  • GitHub Check: tests (1, ubuntu-latest, py312, bash)
  • GitHub Check: tests (1, ubuntu-latest, py311, bash)
🔇 Additional comments (1)
src/snakemake/dag.py (1)

2580-2582: Excellent security improvement by using print_query instead of query.

This change properly implements the redaction of potentially sensitive information in storage object queries when generating summaries. This aligns perfectly with the PR objective of protecting sensitive data while still providing useful output.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/snakemake/jobs.py (1)

83-92: Improved storage object messaging logic

The enhanced conditional logic now provides more precise messaging based on the storage object's retrieve attribute. When retrieve is false, the phrase "keep remote on" correctly conveys that the file will remain in remote storage rather than being retrieved.

The change from query to print_query also implements the core feature of this PR, allowing for redacted file names when sensitive information is present in storage paths.

Consider using a ternary operator for slightly improved readability:

- if f.storage_object.retrieve:
-     phrase = "retrieve from"
- else:
-     phrase = "keep remote on"
+ phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"
🧰 Tools
🪛 Ruff (0.8.2)

84-87: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fef8057 and bd1588a.

📒 Files selected for processing (7)
  • .github/workflows/docker-publish.yml (1 hunks)
  • src/snakemake/dag.py (1 hunks)
  • src/snakemake/io.py (4 hunks)
  • src/snakemake/jobs.py (1 hunks)
  • test-environment.yml (1 hunks)
  • tests/test_censored_path/Snakefile (1 hunks)
  • tests/tests.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • test-environment.yml
  • .github/workflows/docker-publish.yml
  • tests/tests.py
  • tests/test_censored_path/Snakefile
  • src/snakemake/dag.py
  • src/snakemake/io.py
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • src/snakemake/jobs.py
🪛 Ruff (0.8.2)
src/snakemake/jobs.py

84-87: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

⏰ Context from checks skipped due to timeout of 90000ms (21)
  • GitHub Check: tests (10, windows-latest, py312, bash)
  • GitHub Check: tests (10, windows-latest, py311, bash)
  • GitHub Check: tests (10, ubuntu-latest, py312, bash)
  • GitHub Check: tests (10, ubuntu-latest, py311, bash)
  • GitHub Check: tests (5, ubuntu-latest, py312, bash)
  • GitHub Check: tests (5, ubuntu-latest, py311, bash)
  • GitHub Check: tests (4, windows-latest, py312, bash)
  • GitHub Check: tests (4, windows-latest, py311, bash)
  • GitHub Check: tests (4, ubuntu-latest, py312, bash)
  • GitHub Check: tests (4, ubuntu-latest, py311, bash)
  • GitHub Check: tests (3, windows-latest, py312, bash)
  • GitHub Check: tests (3, windows-latest, py311, bash)
  • GitHub Check: tests (3, ubuntu-latest, py312, bash)
  • GitHub Check: tests (3, ubuntu-latest, py311, bash)
  • GitHub Check: tests (2, windows-latest, py312, bash)
  • GitHub Check: tests (2, windows-latest, py311, bash)
  • GitHub Check: tests (2, ubuntu-latest, py312, bash)
  • GitHub Check: tests (2, ubuntu-latest, py311, bash)
  • GitHub Check: tests (1, windows-latest, py312, bash)
  • GitHub Check: tests (1, windows-latest, py311, bash)
  • GitHub Check: tests (1, ubuntu-latest, py311, bash)

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/snakemake/jobs.py (1)

83-91: Improved storage file messaging and security enhancement.

The changes provide more accurate messaging for files based on the actual operation (retrieve vs. keep remote) and implement the censoring feature by using print_query instead of query for storage objects. This aligns with the PR objective to protect sensitive information in storage query strings.

For cleaner code, consider using a ternary operator in the conditional statement:

-        if is_input:
-            if f.storage_object.retrieve:
-                phrase = "retrieve from"
-            else:
-                phrase = "keep remote on"
-        else:
-            phrase = "send to"
+        phrase = "send to" if not is_input else ("retrieve from" if f.storage_object.retrieve else "keep remote on")
🧰 Tools
🪛 Ruff (0.8.2)

84-87: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bd1588a and dade26e.

📒 Files selected for processing (7)
  • .github/workflows/docker-publish.yml (1 hunks)
  • src/snakemake/dag.py (1 hunks)
  • src/snakemake/io.py (4 hunks)
  • src/snakemake/jobs.py (1 hunks)
  • test-environment.yml (1 hunks)
  • tests/test_censored_path/Snakefile (1 hunks)
  • tests/tests.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • test-environment.yml
  • tests/tests.py
  • tests/test_censored_path/Snakefile
  • .github/workflows/docker-publish.yml
  • src/snakemake/dag.py
  • src/snakemake/io.py
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • src/snakemake/jobs.py
🪛 Ruff (0.8.2)
src/snakemake/jobs.py

84-87: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

⏰ Context from checks skipped due to timeout of 90000ms (38)
  • GitHub Check: tests (10, windows-latest, py312, bash)
  • GitHub Check: tests (10, windows-latest, py311, bash)
  • GitHub Check: tests (10, ubuntu-latest, py312, bash)
  • GitHub Check: tests (10, ubuntu-latest, py311, bash)
  • GitHub Check: tests (9, windows-latest, py312, bash)
  • GitHub Check: tests (9, windows-latest, py311, bash)
  • GitHub Check: tests (9, ubuntu-latest, py312, bash)
  • GitHub Check: tests (9, ubuntu-latest, py311, bash)
  • GitHub Check: tests (8, windows-latest, py312, bash)
  • GitHub Check: tests (8, windows-latest, py311, bash)
  • GitHub Check: tests (8, ubuntu-latest, py312, bash)
  • GitHub Check: tests (8, ubuntu-latest, py311, bash)
  • GitHub Check: tests (7, windows-latest, py312, bash)
  • GitHub Check: tests (7, windows-latest, py311, bash)
  • GitHub Check: tests (7, ubuntu-latest, py312, bash)
  • GitHub Check: tests (7, ubuntu-latest, py311, bash)
  • GitHub Check: tests (6, windows-latest, py312, bash)
  • GitHub Check: tests (6, windows-latest, py311, bash)
  • GitHub Check: tests (6, ubuntu-latest, py312, bash)
  • GitHub Check: tests (6, ubuntu-latest, py311, bash)
  • GitHub Check: tests (5, windows-latest, py312, bash)
  • GitHub Check: tests (5, windows-latest, py311, bash)
  • GitHub Check: tests (5, ubuntu-latest, py312, bash)
  • GitHub Check: tests (5, ubuntu-latest, py311, bash)
  • GitHub Check: tests (4, windows-latest, py312, bash)
  • GitHub Check: tests (4, windows-latest, py311, bash)
  • GitHub Check: tests (4, ubuntu-latest, py312, bash)
  • GitHub Check: tests (4, ubuntu-latest, py311, bash)
  • GitHub Check: tests (3, windows-latest, py312, bash)
  • GitHub Check: tests (3, windows-latest, py311, bash)
  • GitHub Check: tests (3, ubuntu-latest, py312, bash)
  • GitHub Check: tests (3, ubuntu-latest, py311, bash)
  • GitHub Check: tests (2, windows-latest, py312, bash)
  • GitHub Check: tests (2, windows-latest, py311, bash)
  • GitHub Check: tests (2, ubuntu-latest, py312, bash)
  • GitHub Check: tests (2, ubuntu-latest, py311, bash)
  • GitHub Check: tests (1, ubuntu-latest, py312, bash)
  • GitHub Check: tests (1, ubuntu-latest, py311, bash)

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/snakemake/jobs.py (1)

85-93: Improved logic for handling storage object messages

The changes enhance the clarity of storage-related file messages in logs by adding a more descriptive phrase based on the file's role and retrieval status. Using print_query instead of query provides the ability to redact sensitive information in the output.

Consider simplifying the conditional logic using a ternary operator as suggested by your static analysis tool:

-        if is_input:
-            if f.storage_object.retrieve:
-                phrase = "retrieve from"
-            else:
-                phrase = "keep remote on"
-        else:
-            phrase = "send to"
+        phrase = "send to" if not is_input else ("retrieve from" if f.storage_object.retrieve else "keep remote on")

Also, consider adding a brief comment explaining the redaction purpose of print_query vs query.

🧰 Tools
🪛 Ruff (0.8.2)

86-89: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 31e48d2 and fea8976.

📒 Files selected for processing (7)
  • .github/workflows/docker-publish.yml (1 hunks)
  • src/snakemake/dag.py (1 hunks)
  • src/snakemake/io.py (4 hunks)
  • src/snakemake/jobs.py (1 hunks)
  • test-environment.yml (1 hunks)
  • tests/test_censored_path/Snakefile (1 hunks)
  • tests/tests.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • test-environment.yml
  • tests/tests.py
  • .github/workflows/docker-publish.yml
  • tests/test_censored_path/Snakefile
  • src/snakemake/dag.py
  • src/snakemake/io.py
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • src/snakemake/jobs.py
🪛 Ruff (0.8.2)
src/snakemake/jobs.py

86-89: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/snakemake/jobs.py (1)

85-93: Good improvement to messaging based on the retrieve attribute.

The changes improve the logic by differentiating between retrieval operations and remote operations when printing messages about storage objects. This addresses the issue where it would always indicate files were being retrieved regardless of the actual operation.

The code now uses print_query instead of query, which will help protect sensitive information in the output.

Consider using a ternary operator to make the code more concise:

-        if is_input:
-            if f.storage_object.retrieve:
-                phrase = "retrieve from"
-            else:
-                phrase = "keep remote on"
-        else:
-            phrase = "send to"
+        phrase = "send to" if not is_input else ("retrieve from" if f.storage_object.retrieve else "keep remote on")
🧰 Tools
🪛 Ruff (0.8.2)

86-89: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 94b366f and bba9c19.

⛔ Files ignored due to path filters (1)
  • pyproject.toml is excluded by !pyproject.toml
📒 Files selected for processing (7)
  • .github/workflows/docker-publish.yml (1 hunks)
  • src/snakemake/dag.py (1 hunks)
  • src/snakemake/io.py (4 hunks)
  • src/snakemake/jobs.py (1 hunks)
  • test-environment.yml (1 hunks)
  • tests/test_censored_path/Snakefile (1 hunks)
  • tests/tests.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • test-environment.yml
  • .github/workflows/docker-publish.yml
  • src/snakemake/dag.py
  • src/snakemake/io.py
  • tests/test_censored_path/Snakefile
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • src/snakemake/jobs.py
  • tests/tests.py
🪛 Ruff (0.8.2)
src/snakemake/jobs.py

86-89: Use ternary operator phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on" instead of if-else-block

Replace if-else-block with phrase = "retrieve from" if f.storage_object.retrieve else "keep remote on"

(SIM108)

tests/tests.py

2198-2198: SyntaxError: Expected a statement


2198-2198: SyntaxError: Expected a statement


2198-2198: SyntaxError: Expected a statement


2198-2198: SyntaxError: Expected a statement

⏰ Context from checks skipped due to timeout of 90000ms (31)
  • GitHub Check: tests (10, windows-latest, py312)
  • GitHub Check: tests (10, windows-latest, py311)
  • GitHub Check: tests (9, windows-latest, py312)
  • GitHub Check: tests (9, windows-latest, py311)
  • GitHub Check: tests (9, ubuntu-latest, py312)
  • GitHub Check: tests (9, ubuntu-latest, py311)
  • GitHub Check: tests (8, windows-latest, py312)
  • GitHub Check: tests (8, windows-latest, py311)
  • GitHub Check: tests (8, ubuntu-latest, py312)
  • GitHub Check: tests (8, ubuntu-latest, py311)
  • GitHub Check: tests (7, windows-latest, py312)
  • GitHub Check: tests (7, windows-latest, py311)
  • GitHub Check: tests (7, ubuntu-latest, py312)
  • GitHub Check: tests (7, ubuntu-latest, py311)
  • GitHub Check: tests (6, windows-latest, py312)
  • GitHub Check: tests (6, windows-latest, py311)
  • GitHub Check: tests (6, ubuntu-latest, py312)
  • GitHub Check: tests (6, ubuntu-latest, py311)
  • GitHub Check: tests (5, windows-latest, py312)
  • GitHub Check: tests (5, windows-latest, py311)
  • GitHub Check: tests (5, ubuntu-latest, py312)
  • GitHub Check: tests (5, ubuntu-latest, py311)
  • GitHub Check: tests (4, windows-latest, py312)
  • GitHub Check: tests (4, windows-latest, py311)
  • GitHub Check: tests (3, windows-latest, py312)
  • GitHub Check: tests (3, windows-latest, py311)
  • GitHub Check: tests (2, windows-latest, py312)
  • GitHub Check: tests (2, windows-latest, py311)
  • GitHub Check: tests (1, windows-latest, py312)
  • GitHub Check: tests (1, windows-latest, py311)
  • GitHub Check: apidocs
🔇 Additional comments (1)
tests/tests.py (1)

2291-2304: Properly tests the new censored path functionality.

The new test ensures that sensitive data like passwords and parameters don't appear in the output when running Snakemake on a file with storage configuration. It appropriately captures both stdout and stderr for verification.

@johanneskoester johanneskoester merged commit ba4d264 into snakemake:main Mar 14, 2025
68 checks passed
@github-project-automation github-project-automation bot moved this from In progress to Done in Snakemake Hackathon March 2025 Mar 14, 2025
johanneskoester pushed a commit that referenced this pull request Mar 14, 2025
🤖 I have created a release *beep* *boop*
---


##
[9.0.0](v8.30.0...v9.0.0)
(2025-03-14)


### ⚠ BREAKING CHANGES

* Logging refactor & add LoggerPluginInterface
([#3107](#3107))

### Features

* [#3412](#3412) - keep
shadow folder of failed job if --keep-incomplete flag is set.
([#3430](#3430))
([22978c3](22978c3))
* add flag --report-after-run to automatically generate the report after
a successfull workflow run
([#3428](#3428))
([b0a7f03](b0a7f03))
* add flatten function to IO utils
([#3424](#3424))
([67fa392](67fa392))
* add helper functions to parse input files
([#2918](#2918))
([63e45a7](63e45a7))
* Add option to print redacted file names
([#3089](#3089))
([ba4d264](ba4d264))
* add support for validation of polars dataframe and lazyframe
([#3262](#3262))
([c7473a6](c7473a6))
* added support for rendering dag with mermaid js
([#3409](#3409))
([7bf8381](7bf8381))
* adding --replace-workflow-config to fully replace workflow configs
(from config: directive) with --configfile, instead of merging them
([#3381](#3381))
([47504a0](47504a0))
* Dynamic module name
([#3401](#3401))
([024dc32](024dc32))
* Enable saving and reloading IOCache object
([#3386](#3386))
([c935953](c935953))
* files added in rule params with workflow.source_path will be available
in used containers
([#3385](#3385))
([a6e45bf](a6e45bf))
* Fix keep_local in storage directive and more freedom over remote
retrieval behaviour
([#3410](#3410))
([67b4739](67b4739))
* inherit parameters of use rule and extend/replace individual items
them when using 'with' directive
([#3365](#3365))
([93e4b92](93e4b92))
* Logging refactor & add LoggerPluginInterface
([#3107](#3107))
([86f1d6e](86f1d6e))
* Maximal file size for checksums
([#3368](#3368))
([b039f8a](b039f8a))
* Modernize package configuration using Pixi
([#3369](#3369))
([77992d8](77992d8))
* multiext support for named input/output
([#3372](#3372))
([05e1378](05e1378))
* optionally auto-group jobs via temp files in case of remote execution
([#3378](#3378))
([cc9bba2](cc9bba2))


### Bug Fixes

* `--delete-all-output` ignores `--dry-run`
([#3265](#3265))
([23fef82](23fef82))
* 3342 faster touch runs and warning messages for non-existing files
([#3398](#3398))
([cd9c3c3](cd9c3c3))
* add default value to max-jobs-per-timespan
([#3043](#3043))
([2959abe](2959abe))
* checkpoints inside modules are overwritten
([#3359](#3359))
([fba3ac7](fba3ac7))
* Convert Path to IOFile
([#3405](#3405))
([c58684c](c58684c))
* Do not perform storage object cleanup with --keep-storage-local-copies
set ([#3358](#3358))
([9a6d14b](9a6d14b))
* edgecases of source deployment in case of remote execution
([#3396](#3396))
([5da13be](5da13be))
* enhance error message formatting for strict DAG-building mode
([#3376](#3376))
([a1c39ee](a1c39ee))
* fix bug in checkpoint handling that led to exceptions in case
checkpoint output was missing upon rerun
([#3423](#3423))
([8cf4a2f](8cf4a2f))
* force check all required outputs
([#3341](#3341))
([495a4e7](495a4e7))
* group job formatting
([#3442](#3442))
([f0b10a3](f0b10a3))
* in remote jobs, upload storage in topological order such that
modification dates are preserved (e.g. in case of group jobs)
([#3377](#3377))
([eace08f](eace08f))
* only skip eval when resource depends on input
([#3374](#3374))
([4574c92](4574c92))
* Prevent execution of conda in apptainer when not explicitly requested
in software deployment method
([#3388](#3388))
([c43c5c0](c43c5c0))
* print filenames with quotes around them in RuleException
([#3269](#3269))
([6baeda5](6baeda5))
* Re-evaluation of free resources
([#3399](#3399))
([6371293](6371293))
* ReadTheDocs layout issue due to src directory change
([#3419](#3419))
([695b127](695b127))
* robustly escaping quotes in generated bash scripts (v2)
([#3297](#3297))
([#3389](#3389))
([58720bd](58720bd))
* Show apptainer image URL in snakemake report
([#3407](#3407))
([45f0450](45f0450))
* Update ReadTheDocs configuration for documentation build to use Pixi
([#3433](#3433))
([3f227a6](3f227a6))


### Documentation

* Add pixi setup instructions to general use tutorial
([#3382](#3382))
([115e81b](115e81b))
* fix contribution section heading levels, fix docs testing setup order
([#3360](#3360))
([051dc53](051dc53))
* fix link to github.com/snakemake/poetry-snakemake-plugin
([#3436](#3436))
([ec6d97c](ec6d97c))
* fix quoting
([#3394](#3394))
([b40f599](b40f599))
* fix rerun-triggers default
([#3403](#3403))
([4430e23](4430e23))
* fix typo 'safe' -&gt; 'save'
([#3384](#3384))
([7755861](7755861))
* mention code formatting in the contribution section
([#3431](#3431))
([e8682b7](e8682b7))
* remove duplicated 'functions'.
([#3356](#3356))
([7c595db](7c595db))
* update broken links documentation
([#3437](#3437))
([e3d0d88](e3d0d88))
* Updating contributing guidelines with new pixi dev setup
([#3415](#3415))
([8e95a12](8e95a12))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: snakemake-bot <snakemake-bot-admin@googlegroups.com>
kjohnsen pushed a commit to kjohnsen/snakemake that referenced this pull request Dec 15, 2025
Addresses issue snakemake#3087.

Adds some functionality so that if a file is a storage object, it can
print a censored version of the query (`print_query`) rather than just
the query itself incase it contains sensitive information in the query
that should not be printed.

I suppose it would make sense to at least put something in the
documentation (in `snakemake-interface-storage-plugins` maybe?) to
explain that the `safe_print` method can be used (as an aside, I tested
this with the `_safe_to_print_url` method of
`snakemake-storage-plugin-xrootd` but figured that method name is
probably not general enough so I can rename it for that plugin).

Note this also slightly changes the message when printing an input file
as previously it would always print `(retrieve from storage)` even if
`retrieve=False`.

### QC
<!-- Make sure that you can tick the boxes below. -->

* [x] The PR contains a test case for the changes or the changes are
already covered by an existing test case.
* [x] The documentation (`docs/`) is updated to reflect the changes or
this is not necessary (e.g. if the change does neither modify the
language nor the behavior or functionalities of Snakemake).


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enabled configurable integration for xrootd-based storage, enhancing
file handling workflows.
- **Refactor**
- Improved messages and output clarity for storage operations, offering
more precise file processing instructions.
- **Tests**
- Added validations to confirm that sensitive credentials remain
concealed.
- **Chores**
- Updated continuous integration triggers to restrict execution based on
repository name.
	- Added a new dependency for enhanced xrootd storage capabilities.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Johannes Koester <johannes.koester@uni-due.de>
kjohnsen pushed a commit to kjohnsen/snakemake that referenced this pull request Dec 15, 2025
🤖 I have created a release *beep* *boop*
---


##
[9.0.0](snakemake/snakemake@v8.30.0...v9.0.0)
(2025-03-14)


### ⚠ BREAKING CHANGES

* Logging refactor & add LoggerPluginInterface
([snakemake#3107](snakemake#3107))

### Features

* [snakemake#3412](snakemake#3412) - keep
shadow folder of failed job if --keep-incomplete flag is set.
([snakemake#3430](snakemake#3430))
([22978c3](snakemake@22978c3))
* add flag --report-after-run to automatically generate the report after
a successfull workflow run
([snakemake#3428](snakemake#3428))
([b0a7f03](snakemake@b0a7f03))
* add flatten function to IO utils
([snakemake#3424](snakemake#3424))
([67fa392](snakemake@67fa392))
* add helper functions to parse input files
([snakemake#2918](snakemake#2918))
([63e45a7](snakemake@63e45a7))
* Add option to print redacted file names
([snakemake#3089](snakemake#3089))
([ba4d264](snakemake@ba4d264))
* add support for validation of polars dataframe and lazyframe
([snakemake#3262](snakemake#3262))
([c7473a6](snakemake@c7473a6))
* added support for rendering dag with mermaid js
([snakemake#3409](snakemake#3409))
([7bf8381](snakemake@7bf8381))
* adding --replace-workflow-config to fully replace workflow configs
(from config: directive) with --configfile, instead of merging them
([snakemake#3381](snakemake#3381))
([47504a0](snakemake@47504a0))
* Dynamic module name
([snakemake#3401](snakemake#3401))
([024dc32](snakemake@024dc32))
* Enable saving and reloading IOCache object
([snakemake#3386](snakemake#3386))
([c935953](snakemake@c935953))
* files added in rule params with workflow.source_path will be available
in used containers
([snakemake#3385](snakemake#3385))
([a6e45bf](snakemake@a6e45bf))
* Fix keep_local in storage directive and more freedom over remote
retrieval behaviour
([snakemake#3410](snakemake#3410))
([67b4739](snakemake@67b4739))
* inherit parameters of use rule and extend/replace individual items
them when using 'with' directive
([snakemake#3365](snakemake#3365))
([93e4b92](snakemake@93e4b92))
* Logging refactor & add LoggerPluginInterface
([snakemake#3107](snakemake#3107))
([86f1d6e](snakemake@86f1d6e))
* Maximal file size for checksums
([snakemake#3368](snakemake#3368))
([b039f8a](snakemake@b039f8a))
* Modernize package configuration using Pixi
([snakemake#3369](snakemake#3369))
([77992d8](snakemake@77992d8))
* multiext support for named input/output
([snakemake#3372](snakemake#3372))
([05e1378](snakemake@05e1378))
* optionally auto-group jobs via temp files in case of remote execution
([snakemake#3378](snakemake#3378))
([cc9bba2](snakemake@cc9bba2))


### Bug Fixes

* `--delete-all-output` ignores `--dry-run`
([snakemake#3265](snakemake#3265))
([23fef82](snakemake@23fef82))
* 3342 faster touch runs and warning messages for non-existing files
([snakemake#3398](snakemake#3398))
([cd9c3c3](snakemake@cd9c3c3))
* add default value to max-jobs-per-timespan
([snakemake#3043](snakemake#3043))
([2959abe](snakemake@2959abe))
* checkpoints inside modules are overwritten
([snakemake#3359](snakemake#3359))
([fba3ac7](snakemake@fba3ac7))
* Convert Path to IOFile
([snakemake#3405](snakemake#3405))
([c58684c](snakemake@c58684c))
* Do not perform storage object cleanup with --keep-storage-local-copies
set ([snakemake#3358](snakemake#3358))
([9a6d14b](snakemake@9a6d14b))
* edgecases of source deployment in case of remote execution
([snakemake#3396](snakemake#3396))
([5da13be](snakemake@5da13be))
* enhance error message formatting for strict DAG-building mode
([snakemake#3376](snakemake#3376))
([a1c39ee](snakemake@a1c39ee))
* fix bug in checkpoint handling that led to exceptions in case
checkpoint output was missing upon rerun
([snakemake#3423](snakemake#3423))
([8cf4a2f](snakemake@8cf4a2f))
* force check all required outputs
([snakemake#3341](snakemake#3341))
([495a4e7](snakemake@495a4e7))
* group job formatting
([snakemake#3442](snakemake#3442))
([f0b10a3](snakemake@f0b10a3))
* in remote jobs, upload storage in topological order such that
modification dates are preserved (e.g. in case of group jobs)
([snakemake#3377](snakemake#3377))
([eace08f](snakemake@eace08f))
* only skip eval when resource depends on input
([snakemake#3374](snakemake#3374))
([4574c92](snakemake@4574c92))
* Prevent execution of conda in apptainer when not explicitly requested
in software deployment method
([snakemake#3388](snakemake#3388))
([c43c5c0](snakemake@c43c5c0))
* print filenames with quotes around them in RuleException
([snakemake#3269](snakemake#3269))
([6baeda5](snakemake@6baeda5))
* Re-evaluation of free resources
([snakemake#3399](snakemake#3399))
([6371293](snakemake@6371293))
* ReadTheDocs layout issue due to src directory change
([snakemake#3419](snakemake#3419))
([695b127](snakemake@695b127))
* robustly escaping quotes in generated bash scripts (v2)
([snakemake#3297](snakemake#3297))
([snakemake#3389](snakemake#3389))
([58720bd](snakemake@58720bd))
* Show apptainer image URL in snakemake report
([snakemake#3407](snakemake#3407))
([45f0450](snakemake@45f0450))
* Update ReadTheDocs configuration for documentation build to use Pixi
([snakemake#3433](snakemake#3433))
([3f227a6](snakemake@3f227a6))


### Documentation

* Add pixi setup instructions to general use tutorial
([snakemake#3382](snakemake#3382))
([115e81b](snakemake@115e81b))
* fix contribution section heading levels, fix docs testing setup order
([snakemake#3360](snakemake#3360))
([051dc53](snakemake@051dc53))
* fix link to github.com/snakemake/poetry-snakemake-plugin
([snakemake#3436](snakemake#3436))
([ec6d97c](snakemake@ec6d97c))
* fix quoting
([snakemake#3394](snakemake#3394))
([b40f599](snakemake@b40f599))
* fix rerun-triggers default
([snakemake#3403](snakemake#3403))
([4430e23](snakemake@4430e23))
* fix typo 'safe' -&gt; 'save'
([snakemake#3384](snakemake#3384))
([7755861](snakemake@7755861))
* mention code formatting in the contribution section
([snakemake#3431](snakemake#3431))
([e8682b7](snakemake@e8682b7))
* remove duplicated 'functions'.
([snakemake#3356](snakemake#3356))
([7c595db](snakemake@7c595db))
* update broken links documentation
([snakemake#3437](snakemake#3437))
([e3d0d88](snakemake@e3d0d88))
* Updating contributing guidelines with new pixi dev setup
([snakemake#3415](snakemake#3415))
([8e95a12](snakemake@8e95a12))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: snakemake-bot <snakemake-bot-admin@googlegroups.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

2 participants