Skip to content

feat: optionally auto-group jobs via temp files in case of remote execution#3378

Merged
johanneskoester merged 12 commits intosnakemake:mainfrom
DrYak:feature_localtemp
Mar 14, 2025
Merged

feat: optionally auto-group jobs via temp files in case of remote execution#3378
johanneskoester merged 12 commits intosnakemake:mainfrom
DrYak:feature_localtemp

Conversation

@DrYak
Copy link
Copy Markdown
Contributor

@DrYak DrYak commented Mar 11, 2025

issue #3355

creates a new flag to flag files that are only accessible within a group running on a compute node, and note accessible to the main process (somewhat reminiscent as how pipes and services are grouped together but with regular files and directories on paths that are local to that machin, e.g., that machine's /tmp)

QC

  • The PR contains a test case for the changes or the changes are already covered by an existing test case.
  • The documentation (docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

Summary by CodeRabbit

  • New Features

    • Introduced support for node-local storage workflows, enabling local handling of temporary and persistent files.
    • Enabled job grouping for tasks that leverage local storage to optimize performance.
  • Documentation

    • Added a new section detailing job-grouped temporary files to guide efficient usage of local storage.
  • Refactor

    • Improved error handling and file flag management to prevent conflicts and ensure smoother operations.

@DrYak DrYak requested a review from johanneskoester as a code owner March 11, 2025 12:34
@DrYak DrYak marked this pull request as draft March 11, 2025 12:35
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 11, 2025

📝 Walkthrough

Walkthrough

This pull request introduces a new Snakemake workflow specifically targeting nodelocal file processing. It adds a new Snakefile with four rules to generate, persist, and then consume temporary and persistent nodelocal files. Additionally, expected result files, test scripts, and shell scripts simulating node-local storage behavior using qsub are included. Internal Snakemake modules are modified to recognize and distinctly handle the "nodelocal" flag in file operations and flag validations. Documentation is updated to include details on job-grouping temporary files.

Changes

File(s) Change Summary
tests/test_nodelocal/Snakefile Added new workflow rules (all, create_nodelocal_temp, create_nodelocal_persist, consume_nodelocal) for processing files using nodelocal storage.
tests/test_nodelocal/expected-results/local/local
tests/test_nodelocal/expected-results/local/persist.txt
tests/test_nodelocal/expected-results/results/result.txt
Added expected results: a file with "local", a file persist.txt with "test", and updated result file with references to scratch/temp.txt and scratch/persist.txt.
tests/test_nodelocal/qsub
tests/test_nodelocal/qsub_stage2
Introduced new shell/Bash scripts to simulate node-local storage environments, including directory creation, namespace isolation, directory binding, logging, and execution of subsequent stages.
tests/tests.py Added new test function test_nodelocal (skipped on Windows) to validate that nodelocal temporary files and directories are handled correctly.
docs/snakefiles/rules.rst Added a new documentation section titled "Auto-grouping via temp files upon remote execution" explaining the usage and benefits of storing temporary files on local storage with group_jobs.
src/snakemake/dag.py Modified handle_storage and handle_pipes_and_services methods to account for files flagged as "nodelocal".
src/snakemake/io.py Updated the temp function to include a new group_jobs parameter and modified the protected function to raise an error if a file is flagged as "nodelocal".
src/snakemake/jobs.py Enhanced format_file and postprocess methods to handle files flagged as "nodelocal" during output file validation.
src/snakemake/rules.py Updated the _set_inoutput_item method to include the "nodelocal" flag in input flag validations.

Sequence Diagram(s)

Snakemake Workflow (nodelocal)

sequenceDiagram
    participant All as rule all
    participant Temp as rule create_nodelocal_temp
    participant Persist as rule create_nodelocal_persist
    participant Consume as rule consume_nodelocal

    All->>Temp: Trigger temp file creation (after 4-sec delay)
    All->>Persist: Trigger persistent file creation (after 4-sec delay)
    Temp-->>Consume: Provide scratch/temp.txt
    Persist-->>Consume: Provide scratch/persist.txt
    Consume-->>All: Generate results/result.txt
Loading

qsub Node-Local Simulation

sequenceDiagram
    participant Qsub as qsub script
    participant Stage2 as qsub_stage2 script
    participant Shell as Sub-shell / Process

    Qsub->>Qsub: Create directories "scratch" and "local"
    Qsub->>Stage2: Execute unshare with qsub_stage2 and pass arguments
    Stage2->>Stage2: Set mount namespace to private and bind "local" to "scratch"
    Stage2->>Stage2: Create marker file in "scratch" to validate binding
    Stage2->>Stage2: Log current date to qsub.log and append last line of input file
    Stage2->>Stage2: Generate and echo random job ID
    Stage2->>Shell: Execute subsequent shell command/script
Loading

Possibly related PRs


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 54d56b8 and 2baf2cd.

📒 Files selected for processing (2)
  • docs/snakefiles/rules.rst (1 hunks)
  • src/snakemake/dag.py (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • docs/snakefiles/rules.rst
  • src/snakemake/dag.py
⏰ Context from checks skipped due to timeout of 90000ms (40)
  • GitHub Check: tests (10, windows-latest, py312, bash)
  • GitHub Check: tests (10, windows-latest, py311, bash)
  • GitHub Check: tests (10, ubuntu-latest, py312, bash)
  • GitHub Check: tests (10, ubuntu-latest, py311, bash)
  • GitHub Check: tests (9, windows-latest, py312, bash)
  • GitHub Check: tests (9, windows-latest, py311, bash)
  • GitHub Check: tests (9, ubuntu-latest, py312, bash)
  • GitHub Check: tests (9, ubuntu-latest, py311, bash)
  • GitHub Check: tests (8, windows-latest, py312, bash)
  • GitHub Check: tests (8, windows-latest, py311, bash)
  • GitHub Check: tests (8, ubuntu-latest, py312, bash)
  • GitHub Check: tests (8, ubuntu-latest, py311, bash)
  • GitHub Check: tests (7, windows-latest, py312, bash)
  • GitHub Check: tests (7, windows-latest, py311, bash)
  • GitHub Check: tests (7, ubuntu-latest, py312, bash)
  • GitHub Check: tests (7, ubuntu-latest, py311, bash)
  • GitHub Check: tests (6, windows-latest, py312, bash)
  • GitHub Check: tests (6, windows-latest, py311, bash)
  • GitHub Check: tests (6, ubuntu-latest, py312, bash)
  • GitHub Check: tests (6, ubuntu-latest, py311, bash)
  • GitHub Check: tests (5, windows-latest, py312, bash)
  • GitHub Check: tests (5, windows-latest, py311, bash)
  • GitHub Check: tests (5, ubuntu-latest, py312, bash)
  • GitHub Check: tests (5, ubuntu-latest, py311, bash)
  • GitHub Check: tests (4, windows-latest, py312, bash)
  • GitHub Check: tests (4, windows-latest, py311, bash)
  • GitHub Check: tests (4, ubuntu-latest, py312, bash)
  • GitHub Check: tests (4, ubuntu-latest, py311, bash)
  • GitHub Check: tests (3, windows-latest, py312, bash)
  • GitHub Check: tests (3, windows-latest, py311, bash)
  • GitHub Check: tests (3, ubuntu-latest, py312, bash)
  • GitHub Check: tests (3, ubuntu-latest, py311, bash)
  • GitHub Check: tests (2, windows-latest, py312, bash)
  • GitHub Check: tests (2, windows-latest, py311, bash)
  • GitHub Check: tests (2, ubuntu-latest, py312, bash)
  • GitHub Check: tests (2, ubuntu-latest, py311, bash)
  • GitHub Check: tests (1, windows-latest, py312, bash)
  • GitHub Check: tests (1, windows-latest, py311, bash)
  • GitHub Check: tests (1, ubuntu-latest, py312, bash)
  • GitHub Check: tests (1, ubuntu-latest, py311, bash)

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@DrYak DrYak changed the title [DRAFT] adding nodelocal feat: nodelocal flag for files only accessible e.g. in the compute node of an HPC running a group. Mar 11, 2025
@DrYak DrYak force-pushed the feature_localtemp branch 5 times, most recently from 5ead617 to 08852d1 Compare March 11, 2025 17:45
@DrYak
Copy link
Copy Markdown
Contributor Author

DrYak commented Mar 11, 2025

Test added:

  • successfully simulates a node that has a local-only "./scratch" directory that is not visible to the caller (it is seen as "./local") using linux-utils' unshare (I hope it works on the CI too?)
  • successfully replicates the trouble I have on the cluster (snakemake tries to schedule multiple jobs together even if they should be ran in sequencially)
  • if manually flagging the files as "temp" and manually creating user groups (instead of nodelocal, see patch attached: manual_temp_group.patch.txt) it kind of works because main job of snakemake 8 realises that temp files aren't required in other groups/rules.
  • added an extra new rule to also test nodelocal not marked as temp (not marked for deletion, should still be ignored by the system that looks for missing files, but should be visible in ./local)

@DrYak DrYak force-pushed the feature_localtemp branch from 0cd9589 to 7aca2e1 Compare March 12, 2025 08:51
@johanneskoester johanneskoester marked this pull request as ready for review March 12, 2025 13:44
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
snakemake/dag.py (1)

1856-1867: Consider simplifying user group assignment with ternary operator.

This code sets up a user group for nodelocal files correctly, but could be slightly simplified.

-                if user_groups:
-                    ugroup = user_groups.pop()
-                else:
-                    ugroup = str(uuid.uuid4())
+                ugroup = user_groups.pop() if user_groups else str(uuid.uuid4())
🧰 Tools
🪛 Ruff (0.8.2)

1858-1861: Use ternary operator ugroup = user_groups.pop() if user_groups else str(uuid.uuid4()) instead of if-else-block

Replace if-else-block with ugroup = user_groups.pop() if user_groups else str(uuid.uuid4())

(SIM108)

tests/test_nodelocal/qsub_stage2 (1)

1-8: Effective test script for node-local functionality simulation.

This script complements the qsub script by setting up the bind mount between directories and simulating job scheduler behavior. The approach of making the 'local' directory appear as 'scratch' within the namespace effectively replicates node-local storage behavior.

Consider adding more detailed comments explaining:

  1. The purpose of the mount --make-rprivate --bind command and how it simulates node-local storage
  2. Why the random number generation is used (to simulate job ID generation from schedulers)
  3. The overall test process flow
 #!/bin/bash
+# Make the mount namespace private and bind 'local' to appear as 'scratch' within the namespace
+# This simulates how node-local storage works - files accessible at one path within the compute node
 mount --make-rprivate --bind local scratch
+# Create a marker file to verify the binding works
 echo "local" > scratch/local
+# Record execution time for debugging
 echo `date` >> qsub.log
+# Log the script being executed
 tail -n1 $1 >> qsub.log
-# simulate printing of job id by a random number
+# Simulate job scheduler behavior - print a job ID (used by Snakemake to track the job)
 echo $RANDOM
+# Execute the actual job script provided as first argument
 sh $1
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6baeda5 and 6af5d12.

📒 Files selected for processing (12)
  • snakemake/dag.py (7 hunks)
  • snakemake/io.py (2 hunks)
  • snakemake/jobs.py (2 hunks)
  • snakemake/rules.py (1 hunks)
  • snakemake/workflow.py (1 hunks)
  • tests/test_nodelocal/Snakefile (1 hunks)
  • tests/test_nodelocal/expected-results/local/local (1 hunks)
  • tests/test_nodelocal/expected-results/local/persist.txt (1 hunks)
  • tests/test_nodelocal/expected-results/results/result.txt (1 hunks)
  • tests/test_nodelocal/qsub (1 hunks)
  • tests/test_nodelocal/qsub_stage2 (1 hunks)
  • tests/tests.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • snakemake/workflow.py
  • snakemake/jobs.py
  • tests/tests.py
  • snakemake/rules.py
  • snakemake/io.py
  • snakemake/dag.py
🪛 Ruff (0.8.2)
snakemake/workflow.py

83-83: snakemake.io.nodelocal imported but unused

Remove unused import

(F401)

snakemake/dag.py

1858-1861: Use ternary operator ugroup = user_groups.pop() if user_groups else str(uuid.uuid4()) instead of if-else-block

Replace if-else-block with ugroup = user_groups.pop() if user_groups else str(uuid.uuid4())

(SIM108)

🔇 Additional comments (21)
snakemake/jobs.py (2)

71-72: Appropriate implementation for nodelocal flag formatting.

The added code correctly follows the established pattern of formatting special file types by appending a description of the flag type.


1591-1592: Good enhancement to handle nodelocal files during post-processing.

This modification ensures that output files flagged as "nodelocal" are properly ignored when checking for missing outputs. This is essential for preventing false error reports when the files exist only on specific compute nodes and aren't visible to other processes.

The condition has been correctly updated to first check for the nodelocal flag before falling back to the temp file check, providing appropriate prioritization.

snakemake/dag.py (6)

924-924: Good addition: Preventing nodelocal files from being stored in remote storage.

Adding the check for "nodelocal" flag ensures that files meant to be local to a node are not inadvertently stored in remote storage, which aligns well with the intended purpose of this feature.


1757-1778: Implementation of nodelocal flag detection and setup looks correct.

The addition of the has_nodelocal variable and corresponding checks properly integrates with the existing pipe and service detection system. The handling of nodelocal files correctly implements the intended behavior of making these files visible only within a specific node.


1791-1799: Appropriate special handling for nodelocal files that have no consumers.

Unlike pipes or services which require consumers to avoid deadlocks, nodelocal files are allowed to have no consumers, which is handled correctly here by skipping the error check when is_nodelocal is true.


1806-1816: Correctly exempting nodelocal files from the 'run' directive restriction.

You've appropriately allowed nodelocal files to be used with the 'run' directive, unlike pipes and services which require 'shell' or 'script' directives. This makes sense since nodelocal files don't have the same execution constraints as pipes/services.


1823-1834: Good check for conflicting user groups.

The code correctly ensures that nodelocal files (like pipes and services) don't have conflicting user groups, which would lead to inconsistent execution behavior.


2622-2624: Good addition to status reporting.

Adding a dedicated status for nodelocal files in the summary output provides useful information to users about these specially handled files.

tests/test_nodelocal/expected-results/local/persist.txt (1)

1-2: Test file looks appropriate for nodelocal feature testing.

This simple test file contains the expected content for validating the nodelocal file persistence functionality. It should correctly match the output of the create_nodelocal_persist rule mentioned in the PR summary.

tests/test_nodelocal/expected-results/local/local (1)

1-2: Test file correctly matches expected structure.

This file contains the simple content "local" which seems appropriate for validating the correct handling of nodelocal file functionality in the test environment.

tests/test_nodelocal/expected-results/results/result.txt (1)

1-3: Expected test results file correctly structured.

This file contains references to both temporary and persistent nodelocal files, which aligns with the test case described in the PR summary. It seems to represent the expected output of a rule that consumes both nodelocal files.

snakemake/workflow.py (1)

83-83: Appropriate import addition for new nodelocal flag feature.

Adding the nodelocal import to the list of I/O-related functions makes this new flag available to users writing Snakefiles. The import follows the established pattern for exposing new file flags in the Snakemake global namespace.

🧰 Tools
🪛 Ruff (0.8.2)

83-83: snakemake.io.nodelocal imported but unused

Remove unused import

(F401)

tests/test_nodelocal/qsub (1)

1-4: Well-designed test script for simulating node-local environment.

This script effectively simulates a node-local environment by creating two directories and using Linux namespace isolation with unshare. This approach is appropriate for testing the nodelocal flag functionality without requiring actual HPC hardware.

Note that this approach relies on Linux namespace capabilities which might be restricted in some CI environments. Verify that your CI pipeline supports these operations.

snakemake/rules.py (1)

452-452: Proper validation for nodelocal flag usage.

Adding the nodelocal flag to the list of output-only flags ensures users receive appropriate warnings if they incorrectly attempt to use it with input files. This change correctly integrates the new feature with Snakemake's existing validation system.

tests/tests.py (1)

2056-2069: Well-implemented test for the new nodelocal functionality.

This test properly verifies that files flagged as "nodelocal" behave as expected - either the temporary file doesn't exist after execution or the scratch directory is empty. The test appropriately uses the @skip_on_windows decorator since the feature is likely using Linux-specific functionality.

tests/test_nodelocal/Snakefile (4)

1-4: LGTM! Clear rule definition.

The "all" rule correctly establishes the dependency graph for the test.


6-14: Good implementation of the temporary nodelocal file creation.

This rule demonstrates the primary use case of nodelocal functionality with the default temporary behavior. The sleep command helps simulate realistic execution timing.


16-24: Good implementation of the persistent nodelocal file.

This rule demonstrates the secondary use case with mktemp=False to create a persistent nodelocal file, providing good test coverage for both behaviors.


26-36: Clear consumption of nodelocal files.

This rule correctly demonstrates how nodelocal files can be consumed by other rules using the paths relative to the consuming rule, not the nodelocal paths seen by the creating rules.

snakemake/io.py (2)

1141-1155: Well-implemented nodelocal function with appropriate error handling.

The function correctly implements the nodelocal flag functionality with good documentation explaining its purpose. It properly handles validation against incompatible flags (protected, storage_object) and includes the option to make files temporary by default, which aligns with the common use case for node-local files.


1233-1234: Good update to protected() function for mutual exclusivity with nodelocal.

This change correctly enforces that protected and nodelocal flags cannot be used together, maintaining integrity of the file flagging system.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
snakemake/dag.py (1)

1856-1868: Good implementation of user group assignment for nodelocal files.

The nodelocal implementation correctly creates user groups (not pipe groups) to ensure sequential execution of jobs with nodelocal files and their dependencies.

The comment about not creating user groups in the node's local instance is helpful for understanding potential resource merging issues.

Consider using a ternary operator for cleaner code:

-                if user_groups:
-                    ugroup = user_groups.pop()
-                else:
-                    ugroup = str(uuid.uuid4())
+                ugroup = user_groups.pop() if user_groups else str(uuid.uuid4())
🧰 Tools
🪛 Ruff (0.8.2)

1859-1862: Use ternary operator ugroup = user_groups.pop() if user_groups else str(uuid.uuid4()) instead of if-else-block

Replace if-else-block with ugroup = user_groups.pop() if user_groups else str(uuid.uuid4())

(SIM108)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6af5d12 and 1c8c5cd.

📒 Files selected for processing (1)
  • snakemake/dag.py (7 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • snakemake/dag.py
🪛 Ruff (0.8.2)
snakemake/dag.py

1859-1862: Use ternary operator ugroup = user_groups.pop() if user_groups else str(uuid.uuid4()) instead of if-else-block

Replace if-else-block with ugroup = user_groups.pop() if user_groups else str(uuid.uuid4())

(SIM108)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: testing-windows (10)
🔇 Additional comments (6)
snakemake/dag.py (6)

924-924: Nice addition to prevent storing nodelocal files.

This change correctly excludes files flagged as "nodelocal" from being stored in remote storage, which is essential since these files are only accessible on the compute node.


1758-1777: Good implementation of nodelocal flag detection.

The implementation properly tracks both if any output files are nodelocal (has_nodelocal) and if a specific file is nodelocal (is_nodelocal). The conditional logic preserves the original behavior for pipe and service files while adding support for nodelocal files.


1791-1793: Appropriate exemption for nodelocal files.

Unlike pipe and service files, nodelocal files don't need consumers as they're only locally accessible on the compute node. This exception is correctly implemented.


1807-1816: Correct exemption for run directive restriction.

This change appropriately exempts nodelocal files from the restriction on using 'run' directives, which only applies to pipe and service files.


1836-1855: Well-structured conditional for pipe group handling.

The original pipe group logic has been properly wrapped in a conditional to ensure it only runs for pipe or service files, not for nodelocal files.


2623-2625: Appropriate status display for nodelocal files.

This change updates the summary method to show "file local to node" status for nodelocal files that don't exist, improving visibility of these special files in the workflow summary.

@DrYak DrYak force-pushed the feature_localtemp branch from 1c8c5cd to 4500a03 Compare March 12, 2025 14:34
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
snakemake/dag.py (1)

1758-1869: LGTM: Added nodelocal file handling logic.

The implementation correctly handles nodelocal files by:

  1. Detecting files flagged as "nodelocal"
  2. Allowing nodelocal files to have zero consumers (unlike pipes/services)
  3. Placing nodelocal jobs and their dependencies in the same user group for serial execution
  4. Appropriately skipping certain error checks for nodelocal files

This implementation ensures that nodelocal files are treated similarly to pipes and services but with appropriate differences for their specific use case (files local to a compute node).

Consider using a ternary operator for more concise code:

-                if user_groups:
-                    ugroup = user_groups.pop()
-                else:
-                    ugroup = str(uuid.uuid4())
+                ugroup = user_groups.pop() if user_groups else str(uuid.uuid4())
🧰 Tools
🪛 Ruff (0.8.2)

1859-1862: Use ternary operator ugroup = user_groups.pop() if user_groups else str(uuid.uuid4()) instead of if-else-block

Replace if-else-block with ugroup = user_groups.pop() if user_groups else str(uuid.uuid4())

(SIM108)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1c8c5cd and 4500a03.

📒 Files selected for processing (12)
  • snakemake/dag.py (7 hunks)
  • snakemake/io.py (2 hunks)
  • snakemake/jobs.py (2 hunks)
  • snakemake/rules.py (1 hunks)
  • snakemake/workflow.py (1 hunks)
  • tests/test_nodelocal/Snakefile (1 hunks)
  • tests/test_nodelocal/expected-results/local/local (1 hunks)
  • tests/test_nodelocal/expected-results/local/persist.txt (1 hunks)
  • tests/test_nodelocal/expected-results/results/result.txt (1 hunks)
  • tests/test_nodelocal/qsub (1 hunks)
  • tests/test_nodelocal/qsub_stage2 (1 hunks)
  • tests/tests.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • tests/test_nodelocal/expected-results/local/persist.txt
  • tests/test_nodelocal/expected-results/results/result.txt
  • tests/test_nodelocal/expected-results/local/local
  • tests/test_nodelocal/qsub
  • tests/test_nodelocal/qsub_stage2
  • tests/tests.py
  • snakemake/rules.py
  • snakemake/jobs.py
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • snakemake/dag.py
  • snakemake/io.py
  • snakemake/workflow.py
🪛 Ruff (0.8.2)
snakemake/dag.py

1859-1862: Use ternary operator ugroup = user_groups.pop() if user_groups else str(uuid.uuid4()) instead of if-else-block

Replace if-else-block with ugroup = user_groups.pop() if user_groups else str(uuid.uuid4())

(SIM108)

snakemake/workflow.py

83-83: snakemake.io.nodelocal imported but unused

Remove unused import

(F401)

⏰ Context from checks skipped due to timeout of 90000ms (30)
  • GitHub Check: testing (10, 3.12, bash)
  • GitHub Check: testing (10, 3.11, bash)
  • GitHub Check: testing (9, 3.12, bash)
  • GitHub Check: testing (9, 3.11, bash)
  • GitHub Check: testing (8, 3.12, bash)
  • GitHub Check: testing (8, 3.11, bash)
  • GitHub Check: testing (7, 3.12, bash)
  • GitHub Check: testing (7, 3.11, bash)
  • GitHub Check: testing (6, 3.12, bash)
  • GitHub Check: testing-windows (10)
  • GitHub Check: testing (6, 3.11, bash)
  • GitHub Check: testing-windows (9)
  • GitHub Check: testing (5, 3.12, bash)
  • GitHub Check: testing-windows (8)
  • GitHub Check: testing (5, 3.11, bash)
  • GitHub Check: testing-windows (7)
  • GitHub Check: testing (4, 3.12, bash)
  • GitHub Check: testing-windows (6)
  • GitHub Check: testing (4, 3.11, bash)
  • GitHub Check: testing-windows (5)
  • GitHub Check: testing (3, 3.12, bash)
  • GitHub Check: testing-windows (4)
  • GitHub Check: testing (3, 3.11, bash)
  • GitHub Check: testing-windows (3)
  • GitHub Check: testing (2, 3.12, bash)
  • GitHub Check: testing (2, 3.11, bash)
  • GitHub Check: testing-windows (2)
  • GitHub Check: testing (1, 3.12, bash)
  • GitHub Check: testing-windows (1)
  • GitHub Check: testing (1, 3.11, bash)
🔇 Additional comments (9)
snakemake/workflow.py (1)

83-83: Import addition for nodelocal flag looks good.

The addition of nodelocal to the imports from snakemake.io will make this function available to users in their Snakefiles, similar to other I/O flag functions like temp, protected, etc.

🧰 Tools
🪛 Ruff (0.8.2)

83-83: snakemake.io.nodelocal imported but unused

Remove unused import

(F401)

tests/test_nodelocal/Snakefile (4)

1-4: Well-defined target rule.

The all rule correctly defines the workflow's final target as the result file that will contain information about the nodelocal files.


6-13: Good implementation of temporary nodelocal file creation.

This rule creates a temporary nodelocal file in the scratch directory. The 4-second sleep mimics a longer running process, which is useful for testing potential race conditions in the scheduler when dealing with nodelocal files.


16-23: Good implementation of persistent nodelocal file creation.

This rule demonstrates the usage of mktemp=False parameter to create a persistent nodelocal file (not automatically deleted). This tests an important feature variation of the nodelocal flag.


26-36: Effective consumption of nodelocal files.

This rule properly demonstrates how nodelocal files are consumed from another rule. Using ls to check file existence is a simple and effective way to verify that the files are accessible from the job.

snakemake/io.py (2)

1141-1154: Well-implemented nodelocal flag function.

The nodelocal function properly implements a flag for files that exist only on the compute node executing group jobs. It follows the established pattern of flag functions in the codebase:

  1. Performs appropriate validation checks against conflicting flags (protected, storage_object)
  2. Provides informative SyntaxError messages when incompatible flags are detected
  3. Has a sensible default (mktemp=True) that also marks files as temporary unless specified otherwise

The code includes helpful comments explaining when nodelocal might be used with pipes or directories.


1233-1234: Appropriate mutual exclusivity check for protected and nodelocal flags.

The update to the protected function correctly checks for and prevents using both protected and nodelocal flags together, which would be contradictory since nodelocal files are not accessible from the main process and thus cannot be reliably protected.

snakemake/dag.py (2)

923-925: LGTM: Added nodelocal flag handling for storage exclusion.

This change correctly prevents files flagged as "nodelocal" from being stored in storage, which aligns with the goal of having files that are only accessible on the compute node.


2623-2624: LGTM: Added nodelocal status in summary report.

This change appropriately updates the status reporting for nodelocal files in the summary method, providing clear information to users about files that exist only on compute nodes.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 12, 2025

Caution

An unexpected error occurred while opening a pull request: Not Found - https://docs.github.com/rest/git/refs#get-a-reference

@DrYak DrYak force-pushed the feature_localtemp branch from 4500a03 to 2f9cca3 Compare March 12, 2025 15:05
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
tests/test_nodelocal/qsub_stage2 (1)

1-21: 🛠️ Refactor suggestion

Add cleanup and validate prerequisites.

The script creates a bind mount but doesn't clean it up when done, which could affect subsequent runs. Also, it doesn't verify if the required directories exist before starting.

Consider adding this at the beginning of the script:

# Ensure the required directories exist
if [ ! -d "local" ] || [ ! -d "scratch" ]; then
  echo "Error: 'local' and/or 'scratch' directories don't exist" >&2
  exit 1
fi

And at the end:

# Cleanup
trap 'umount scratch 2>/dev/null || true' EXIT

This helps ensure the script works as expected and cleans up after itself.

🧹 Nitpick comments (2)
tests/test_nodelocal/qsub_stage2 (2)

16-20: Quote variables and use modern command substitution.

The script has two minor issues:

  1. The date command uses backticks (older syntax) instead of the modern $() substitution
  2. $1 is not quoted, which could cause issues if the argument contains spaces
-echo `date` >> qsub.log
-tail -n1 $1 >> qsub.log
+echo "$(date)" >> qsub.log
+tail -n1 "$1" >> qsub.log
 # simulate printing of job id by a random number
 echo $RANDOM
-sh $1
+sh "$1"

1-21: Add detailed header documentation.

While the script has good inline comments, it would benefit from a comprehensive header explaining its purpose, usage, and relationship to the nodelocal feature being tested.

Consider adding this at the top of the script:

#!/bin/bash
#
# qsub_stage2 - Second stage of nodelocal test simulation 
#
# This script simulates a compute node environment by creating a private mount
# namespace where 'local' directory is bound to 'scratch' to replicate the scenario
# of node-local storage in HPC environments. This is part of testing the 'nodelocal'
# flag functionality in Snakemake which addresses files that are only accessible
# within specific compute nodes but not visible to the main process.
#
# Usage: ./qsub_stage2 <shell_script_to_execute>
#
# Note: This script requires root privileges for mount operations or needs to be
# executed with unshare --mount (as done in the qsub wrapper script).
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4500a03 and 2f9cca3.

📒 Files selected for processing (8)
  • snakemake/dag.py (7 hunks)
  • tests/test_nodelocal/Snakefile (1 hunks)
  • tests/test_nodelocal/expected-results/local/local (1 hunks)
  • tests/test_nodelocal/expected-results/local/persist.txt (1 hunks)
  • tests/test_nodelocal/expected-results/results/result.txt (1 hunks)
  • tests/test_nodelocal/qsub (1 hunks)
  • tests/test_nodelocal/qsub_stage2 (1 hunks)
  • tests/tests.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
  • tests/test_nodelocal/expected-results/local/persist.txt
  • tests/test_nodelocal/expected-results/results/result.txt
  • tests/test_nodelocal/expected-results/local/local
  • tests/test_nodelocal/qsub
  • tests/test_nodelocal/Snakefile
  • tests/tests.py
  • snakemake/dag.py
⏰ Context from checks skipped due to timeout of 90000ms (30)
  • GitHub Check: testing (10, 3.12, bash)
  • GitHub Check: testing (10, 3.11, bash)
  • GitHub Check: testing (9, 3.12, bash)
  • GitHub Check: testing (9, 3.11, bash)
  • GitHub Check: testing (8, 3.12, bash)
  • GitHub Check: testing (8, 3.11, bash)
  • GitHub Check: testing (7, 3.12, bash)
  • GitHub Check: testing (7, 3.11, bash)
  • GitHub Check: testing (6, 3.12, bash)
  • GitHub Check: testing (6, 3.11, bash)
  • GitHub Check: testing (5, 3.12, bash)
  • GitHub Check: testing-windows (10)
  • GitHub Check: testing (5, 3.11, bash)
  • GitHub Check: testing-windows (9)
  • GitHub Check: testing (4, 3.12, bash)
  • GitHub Check: testing-windows (8)
  • GitHub Check: testing (4, 3.11, bash)
  • GitHub Check: testing-windows (7)
  • GitHub Check: testing (3, 3.12, bash)
  • GitHub Check: testing-windows (6)
  • GitHub Check: testing (3, 3.11, bash)
  • GitHub Check: testing-windows (5)
  • GitHub Check: testing (2, 3.12, bash)
  • GitHub Check: testing-windows (4)
  • GitHub Check: testing (2, 3.11, bash)
  • GitHub Check: testing-windows (3)
  • GitHub Check: testing (1, 3.12, bash)
  • GitHub Check: testing-windows (2)
  • GitHub Check: testing (1, 3.11, bash)
  • GitHub Check: testing-windows (1)

# - NOT accessible on other nodes nor on the login node
#
# Make the mount namespace private and bind 'local' to appear as 'scratch' within the namespace
mount --make-rprivate --bind local scratch
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error handling for mount command.

The mount command might fail due to insufficient permissions or if directories don't exist, but the script continues regardless of the outcome.

-mount --make-rprivate --bind local scratch
+if ! mount --make-rprivate --bind local scratch; then
+  echo "Error: Failed to bind mount local to scratch. Ensure both directories exist and you have sufficient permissions." >&2
+  exit 1
+fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
mount --make-rprivate --bind local scratch
if ! mount --make-rprivate --bind local scratch; then
echo "Error: Failed to bind mount local to scratch. Ensure both directories exist and you have sufficient permissions." >&2
exit 1
fi

@DrYak
Copy link
Copy Markdown
Contributor Author

DrYak commented Mar 12, 2025

@johanneskoester it works, Peter managed to pin-point the trouble source and help fix

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
docs/snakefiles/rules.rst (2)

1800-1804: Improve Grammar and Clarity in the Node-local Files Description

  • In the sentence beginning at line 1802, consider revising “attached locally on the a cluster compute node” to “attached locally on a compute node” for clarity.
  • Update “Snakemake (since version 8.31.0) allow files” to “Snakemake (since version 8.31.0) allows files” to correct the subject–verb agreement.
  • Consider adding a brief note that by default files flagged as nodelocal are automatically treated as temporary (i.e. marked as temp unless mktemp is set to False) and explain the execution benefit (i.e. scheduling rules that use them on the same physical node).

1807-1824: Enhance the Code Block Example for Node-local Files

  • The example clearly demonstrates how to use the new nodelocal flag. For further clarity, consider inserting short inline comments. For example, in rule NAME1, you might add a comment indicating that this rule creates an intermediate file on the node-local storage, and in rule NAME2 that the file is then consumed to produce the final output.
  • This additional annotation may help users quickly grasp the purpose and benefits of marking outputs with nodelocal.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2f9cca3 and 2c159c0.

📒 Files selected for processing (1)
  • docs/snakefiles/rules.rst (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (29)
  • GitHub Check: testing (10, 3.12, bash)
  • GitHub Check: testing (10, 3.11, bash)
  • GitHub Check: testing (9, 3.12, bash)
  • GitHub Check: testing (9, 3.11, bash)
  • GitHub Check: testing (8, 3.12, bash)
  • GitHub Check: testing (8, 3.11, bash)
  • GitHub Check: testing (7, 3.12, bash)
  • GitHub Check: testing (7, 3.11, bash)
  • GitHub Check: testing (6, 3.12, bash)
  • GitHub Check: testing (6, 3.11, bash)
  • GitHub Check: testing (5, 3.12, bash)
  • GitHub Check: testing (5, 3.11, bash)
  • GitHub Check: testing-windows (10)
  • GitHub Check: testing (4, 3.12, bash)
  • GitHub Check: testing-windows (9)
  • GitHub Check: testing (4, 3.11, bash)
  • GitHub Check: testing-windows (8)
  • GitHub Check: testing (3, 3.12, bash)
  • GitHub Check: testing-windows (7)
  • GitHub Check: testing (3, 3.11, bash)
  • GitHub Check: testing-windows (6)
  • GitHub Check: testing (2, 3.12, bash)
  • GitHub Check: testing (2, 3.11, bash)
  • GitHub Check: testing-windows (4)
  • GitHub Check: testing (1, 3.12, bash)
  • GitHub Check: testing-windows (3)
  • GitHub Check: testing (1, 3.11, bash)
  • GitHub Check: testing-windows (2)
  • GitHub Check: testing-windows (1)

@DrYak DrYak force-pushed the feature_localtemp branch 3 times, most recently from 4e01f14 to a462c5b Compare March 12, 2025 16:47
@sonarqubecloud
Copy link
Copy Markdown

DrYak added 6 commits March 14, 2025 10:53
- rename flag option for clarity
- make docstring clearer regarding option
- update documentation about option and upcoming version
- adapt tests
- expose functionality as a option on "temp()"
- adapt documentation and tests
@DrYak DrYak force-pushed the feature_localtemp branch from 7806807 to 3f7132d Compare March 14, 2025 09:54
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
tests/test_nodelocal/qsub_stage2 (3)

17-17: Add parameter validation for the script input.

The script uses $1 without checking if it exists or is valid, which could lead to errors if the script is called without arguments.

+if [ -z "$1" ] || [ ! -f "$1" ]; then
+  echo "Error: Please provide a valid file as the first argument." >&2
+  exit 1
+fi
 tail -n1 $1 >> qsub.log

20-21: Add cleanup for mount binding before exiting.

The script creates a mount binding but doesn't clean it up before exiting, which could lead to orphaned mount points.

 echo $RANDOM
-sh $1
+# Execute the script and capture its exit code
+sh $1
+exit_code=$?
+
+# Clean up the mount binding
+umount scratch || echo "Warning: Failed to unmount scratch directory" >&2
+
+exit $exit_code

1-21: Add a trap for cleanup on script exit.

To ensure resources are properly cleaned up even if the script terminates unexpectedly, consider adding a trap handler.

 #!/bin/bash
 
+# Function to clean up resources
+cleanup() {
+  umount scratch 2>/dev/null || true
+  echo "Cleanup completed" >&2
+}
+
+# Set trap to call cleanup function on EXIT
+trap cleanup EXIT
+
 # Simulates node-local storage (a.k.a. "local scratch"):
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7806807 and 3f7132d.

📒 Files selected for processing (12)
  • docs/snakefiles/rules.rst (1 hunks)
  • src/snakemake/dag.py (7 hunks)
  • src/snakemake/io.py (2 hunks)
  • src/snakemake/jobs.py (2 hunks)
  • src/snakemake/rules.py (1 hunks)
  • tests/test_nodelocal/Snakefile (1 hunks)
  • tests/test_nodelocal/expected-results/local/local (1 hunks)
  • tests/test_nodelocal/expected-results/local/persist.txt (1 hunks)
  • tests/test_nodelocal/expected-results/results/result.txt (1 hunks)
  • tests/test_nodelocal/qsub (1 hunks)
  • tests/test_nodelocal/qsub_stage2 (1 hunks)
  • tests/tests.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • tests/test_nodelocal/expected-results/local/local
  • tests/test_nodelocal/expected-results/results/result.txt
  • tests/test_nodelocal/expected-results/local/persist.txt
  • tests/test_nodelocal/qsub
  • tests/tests.py
  • tests/test_nodelocal/Snakefile
  • src/snakemake/rules.py
  • docs/snakefiles/rules.rst
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • src/snakemake/jobs.py
  • src/snakemake/io.py
  • src/snakemake/dag.py
⏰ Context from checks skipped due to timeout of 90000ms (41)
  • GitHub Check: tests (10, windows-latest, py312, bash)
  • GitHub Check: tests (10, windows-latest, py311, bash)
  • GitHub Check: tests (10, ubuntu-latest, py312, bash)
  • GitHub Check: tests (10, ubuntu-latest, py311, bash)
  • GitHub Check: tests (9, windows-latest, py312, bash)
  • GitHub Check: tests (9, windows-latest, py311, bash)
  • GitHub Check: tests (9, ubuntu-latest, py312, bash)
  • GitHub Check: tests (9, ubuntu-latest, py311, bash)
  • GitHub Check: tests (8, windows-latest, py312, bash)
  • GitHub Check: tests (8, windows-latest, py311, bash)
  • GitHub Check: tests (8, ubuntu-latest, py312, bash)
  • GitHub Check: tests (8, ubuntu-latest, py311, bash)
  • GitHub Check: tests (7, windows-latest, py312, bash)
  • GitHub Check: tests (7, windows-latest, py311, bash)
  • GitHub Check: tests (7, ubuntu-latest, py312, bash)
  • GitHub Check: tests (7, ubuntu-latest, py311, bash)
  • GitHub Check: tests (6, windows-latest, py312, bash)
  • GitHub Check: tests (6, windows-latest, py311, bash)
  • GitHub Check: tests (6, ubuntu-latest, py312, bash)
  • GitHub Check: tests (6, ubuntu-latest, py311, bash)
  • GitHub Check: tests (5, windows-latest, py312, bash)
  • GitHub Check: tests (5, windows-latest, py311, bash)
  • GitHub Check: tests (5, ubuntu-latest, py312, bash)
  • GitHub Check: tests (5, ubuntu-latest, py311, bash)
  • GitHub Check: tests (4, windows-latest, py312, bash)
  • GitHub Check: tests (4, windows-latest, py311, bash)
  • GitHub Check: tests (4, ubuntu-latest, py312, bash)
  • GitHub Check: tests (4, ubuntu-latest, py311, bash)
  • GitHub Check: tests (3, windows-latest, py312, bash)
  • GitHub Check: tests (3, windows-latest, py311, bash)
  • GitHub Check: tests (3, ubuntu-latest, py312, bash)
  • GitHub Check: tests (3, ubuntu-latest, py311, bash)
  • GitHub Check: tests (2, windows-latest, py312, bash)
  • GitHub Check: tests (2, windows-latest, py311, bash)
  • GitHub Check: tests (2, ubuntu-latest, py312, bash)
  • GitHub Check: tests (2, ubuntu-latest, py311, bash)
  • GitHub Check: tests (1, windows-latest, py312, bash)
  • GitHub Check: tests (1, windows-latest, py311, bash)
  • GitHub Check: tests (1, ubuntu-latest, py312, bash)
  • GitHub Check: tests (1, ubuntu-latest, py311, bash)
  • GitHub Check: apidocs
🔇 Additional comments (13)
tests/test_nodelocal/qsub_stage2 (1)

8-8: Add error handling for mount command.

The mount command might fail due to insufficient permissions or if directories don't exist, but the script continues regardless of the outcome.

-mount --make-rprivate --bind local scratch
+if ! mount --make-rprivate --bind local scratch; then
+  echo "Error: Failed to bind mount local to scratch. Ensure both directories exist and you have sufficient permissions." >&2
+  exit 1
+fi
src/snakemake/jobs.py (2)

73-74: Great addition to handle nodelocal files.

This change properly extends the format_file function to handle files flagged as "nodelocal", following the same pattern used for other file types like pipes and services.


1622-1623: Correctly ignores nodelocal files in output validation.

This change makes perfect sense - nodelocal files (which only exist on compute nodes) should be ignored when checking for missing output after a job completes, as they won't be visible to the main Snakemake process. This change allows jobs with nodelocal outputs to be properly marked as successful even when those outputs aren't visible to the calling process.

src/snakemake/io.py (3)

1132-1139: Good implementation of the nodelocal functionality.

The addition of the group_jobs parameter to the temp function is a clean way to implement the nodelocal functionality. The default value of False ensures backward compatibility, and the documentation clearly explains the purpose and behavior of this parameter.


1146-1148: Correct implementation of the nodelocal flag.

The implementation correctly flags files as "nodelocal" when group_jobs is True, providing a simple way for users to mark files that should only exist on compute nodes.


1227-1228: Good check for mutually exclusive flags.

This check properly prevents files from being both protected and nodelocal, which would be contradictory - protected files are meant to be preserved while nodelocal files are temporary and node-specific.

src/snakemake/dag.py (7)

932-933: LGTM: Excluding nodelocal files from storage.

The addition of the nodelocal flag check prevents these node-specific files from being stored in centralized storage, which aligns with the feature's purpose.


1775-1780: LGTM: Tracking nodelocal files alongside pipes and services.

Adding the has_nodelocal tracking variable and check for nodelocal-flagged files ensures proper handling of these node-specific resources.


1791-1795: LGTM: Separate state tracking for nodelocal files.

Correctly separates the state tracking between pipe/service files and nodelocal files, allowing for different behaviors in the handling logic.


1808-1809: LGTM: Allowing nodelocal files different execution constraints.

These changes properly exempt nodelocal files from the constraints that apply to pipes and services:

  1. Nodelocal files are allowed to exist without consumers
  2. Nodelocal files can be consumed by rules with 'run' directives

This flexibility is appropriate since nodelocal files represent persistent resources that exist within a node's environment.

Also applies to: 1823-1833


1840-1841: LGTM: Including nodelocal in flow control and error messages.

The flow control and error message changes properly incorporate nodelocal files in the workflow logic.

Also applies to: 1847-1849


1853-1883: LGTM: Serial execution strategy for nodelocal files.

This implementation correctly distinguishes between pipe/service files and nodelocal files by:

  1. Creating pipe groups for pipe/service files (parallel execution)
  2. Creating user groups for nodelocal files (serial execution)

The serial execution approach addresses the issue mentioned in the PR description where Snakemake attempts to schedule multiple jobs simultaneously when they should be executed sequentially.


2637-2639: LGTM: Added status reporting for nodelocal files.

Adding the "file local to node" status in the summary method provides appropriate visibility when examining workflow execution status.

@DrYak
Copy link
Copy Markdown
Contributor Author

DrYak commented Mar 14, 2025

Just basically a different interface bbut same functionality.

Interface updated as requested:

temp("…file…", group_jobs=True)

is now the way to put the flag "nodelocal" on a file.

(Under-the-hood logic remains the same, docstring of "temp" explains the internal "nodelocal" flag, main documentation updated).

@DrYak DrYak requested a review from johanneskoester March 14, 2025 10:13
@johanneskoester johanneskoester changed the title feat: nodelocal flag for files only accessible e.g. in the compute node of an HPC running a group. feat: optionally auto-group jobs via temp files in case of remote execution Mar 14, 2025
@johanneskoester johanneskoester enabled auto-merge (squash) March 14, 2025 10:43
@johanneskoester johanneskoester merged commit cc9bba2 into snakemake:main Mar 14, 2025
49 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in Snakemake Hackathon March 2025 Mar 14, 2025
@DrYak
Copy link
Copy Markdown
Contributor Author

DrYak commented Mar 14, 2025

🎊

johanneskoester pushed a commit that referenced this pull request Mar 14, 2025
🤖 I have created a release *beep* *boop*
---


##
[9.0.0](v8.30.0...v9.0.0)
(2025-03-14)


### ⚠ BREAKING CHANGES

* Logging refactor & add LoggerPluginInterface
([#3107](#3107))

### Features

* [#3412](#3412) - keep
shadow folder of failed job if --keep-incomplete flag is set.
([#3430](#3430))
([22978c3](22978c3))
* add flag --report-after-run to automatically generate the report after
a successfull workflow run
([#3428](#3428))
([b0a7f03](b0a7f03))
* add flatten function to IO utils
([#3424](#3424))
([67fa392](67fa392))
* add helper functions to parse input files
([#2918](#2918))
([63e45a7](63e45a7))
* Add option to print redacted file names
([#3089](#3089))
([ba4d264](ba4d264))
* add support for validation of polars dataframe and lazyframe
([#3262](#3262))
([c7473a6](c7473a6))
* added support for rendering dag with mermaid js
([#3409](#3409))
([7bf8381](7bf8381))
* adding --replace-workflow-config to fully replace workflow configs
(from config: directive) with --configfile, instead of merging them
([#3381](#3381))
([47504a0](47504a0))
* Dynamic module name
([#3401](#3401))
([024dc32](024dc32))
* Enable saving and reloading IOCache object
([#3386](#3386))
([c935953](c935953))
* files added in rule params with workflow.source_path will be available
in used containers
([#3385](#3385))
([a6e45bf](a6e45bf))
* Fix keep_local in storage directive and more freedom over remote
retrieval behaviour
([#3410](#3410))
([67b4739](67b4739))
* inherit parameters of use rule and extend/replace individual items
them when using 'with' directive
([#3365](#3365))
([93e4b92](93e4b92))
* Logging refactor & add LoggerPluginInterface
([#3107](#3107))
([86f1d6e](86f1d6e))
* Maximal file size for checksums
([#3368](#3368))
([b039f8a](b039f8a))
* Modernize package configuration using Pixi
([#3369](#3369))
([77992d8](77992d8))
* multiext support for named input/output
([#3372](#3372))
([05e1378](05e1378))
* optionally auto-group jobs via temp files in case of remote execution
([#3378](#3378))
([cc9bba2](cc9bba2))


### Bug Fixes

* `--delete-all-output` ignores `--dry-run`
([#3265](#3265))
([23fef82](23fef82))
* 3342 faster touch runs and warning messages for non-existing files
([#3398](#3398))
([cd9c3c3](cd9c3c3))
* add default value to max-jobs-per-timespan
([#3043](#3043))
([2959abe](2959abe))
* checkpoints inside modules are overwritten
([#3359](#3359))
([fba3ac7](fba3ac7))
* Convert Path to IOFile
([#3405](#3405))
([c58684c](c58684c))
* Do not perform storage object cleanup with --keep-storage-local-copies
set ([#3358](#3358))
([9a6d14b](9a6d14b))
* edgecases of source deployment in case of remote execution
([#3396](#3396))
([5da13be](5da13be))
* enhance error message formatting for strict DAG-building mode
([#3376](#3376))
([a1c39ee](a1c39ee))
* fix bug in checkpoint handling that led to exceptions in case
checkpoint output was missing upon rerun
([#3423](#3423))
([8cf4a2f](8cf4a2f))
* force check all required outputs
([#3341](#3341))
([495a4e7](495a4e7))
* group job formatting
([#3442](#3442))
([f0b10a3](f0b10a3))
* in remote jobs, upload storage in topological order such that
modification dates are preserved (e.g. in case of group jobs)
([#3377](#3377))
([eace08f](eace08f))
* only skip eval when resource depends on input
([#3374](#3374))
([4574c92](4574c92))
* Prevent execution of conda in apptainer when not explicitly requested
in software deployment method
([#3388](#3388))
([c43c5c0](c43c5c0))
* print filenames with quotes around them in RuleException
([#3269](#3269))
([6baeda5](6baeda5))
* Re-evaluation of free resources
([#3399](#3399))
([6371293](6371293))
* ReadTheDocs layout issue due to src directory change
([#3419](#3419))
([695b127](695b127))
* robustly escaping quotes in generated bash scripts (v2)
([#3297](#3297))
([#3389](#3389))
([58720bd](58720bd))
* Show apptainer image URL in snakemake report
([#3407](#3407))
([45f0450](45f0450))
* Update ReadTheDocs configuration for documentation build to use Pixi
([#3433](#3433))
([3f227a6](3f227a6))


### Documentation

* Add pixi setup instructions to general use tutorial
([#3382](#3382))
([115e81b](115e81b))
* fix contribution section heading levels, fix docs testing setup order
([#3360](#3360))
([051dc53](051dc53))
* fix link to github.com/snakemake/poetry-snakemake-plugin
([#3436](#3436))
([ec6d97c](ec6d97c))
* fix quoting
([#3394](#3394))
([b40f599](b40f599))
* fix rerun-triggers default
([#3403](#3403))
([4430e23](4430e23))
* fix typo 'safe' -&gt; 'save'
([#3384](#3384))
([7755861](7755861))
* mention code formatting in the contribution section
([#3431](#3431))
([e8682b7](e8682b7))
* remove duplicated 'functions'.
([#3356](#3356))
([7c595db](7c595db))
* update broken links documentation
([#3437](#3437))
([e3d0d88](e3d0d88))
* Updating contributing guidelines with new pixi dev setup
([#3415](#3415))
([8e95a12](8e95a12))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: snakemake-bot <snakemake-bot-admin@googlegroups.com>
kjohnsen pushed a commit to kjohnsen/snakemake that referenced this pull request Dec 15, 2025
…cution (snakemake#3378)

<!--Add a description of your PR here-->

issue snakemake#3355 

creates a new flag to flag files that are only accessible within a group
running on a compute node, and note accessible to the main process
(somewhat reminiscent as how pipes and services are grouped together but
with regular files and directories on paths that are local to that
machin, e.g., that machine's /tmp)

### QC
<!-- Make sure that you can tick the boxes below. -->

* [x] The PR contains a test case for the changes or the changes are
already covered by an existing test case.
* [x] The documentation (`docs/`) is updated to reflect the changes or
this is not necessary (e.g. if the change does neither modify the
language nor the behavior or functionalities of Snakemake).


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced support for node-local storage workflows, enabling local
handling of temporary and persistent files.
- Enabled job grouping for tasks that leverage local storage to optimize
performance.

- **Documentation**
- Added a new section detailing job-grouped temporary files to guide
efficient usage of local storage.

- **Refactor**
- Improved error handling and file flag management to prevent conflicts
and ensure smoother operations.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Johannes Köster <johannes.koester@uni-due.de>
Co-authored-by: Johannes Köster <johannes.koester@tu-dortmund.de>
kjohnsen pushed a commit to kjohnsen/snakemake that referenced this pull request Dec 15, 2025
🤖 I have created a release *beep* *boop*
---


##
[9.0.0](snakemake/snakemake@v8.30.0...v9.0.0)
(2025-03-14)


### ⚠ BREAKING CHANGES

* Logging refactor & add LoggerPluginInterface
([snakemake#3107](snakemake#3107))

### Features

* [snakemake#3412](snakemake#3412) - keep
shadow folder of failed job if --keep-incomplete flag is set.
([snakemake#3430](snakemake#3430))
([22978c3](snakemake@22978c3))
* add flag --report-after-run to automatically generate the report after
a successfull workflow run
([snakemake#3428](snakemake#3428))
([b0a7f03](snakemake@b0a7f03))
* add flatten function to IO utils
([snakemake#3424](snakemake#3424))
([67fa392](snakemake@67fa392))
* add helper functions to parse input files
([snakemake#2918](snakemake#2918))
([63e45a7](snakemake@63e45a7))
* Add option to print redacted file names
([snakemake#3089](snakemake#3089))
([ba4d264](snakemake@ba4d264))
* add support for validation of polars dataframe and lazyframe
([snakemake#3262](snakemake#3262))
([c7473a6](snakemake@c7473a6))
* added support for rendering dag with mermaid js
([snakemake#3409](snakemake#3409))
([7bf8381](snakemake@7bf8381))
* adding --replace-workflow-config to fully replace workflow configs
(from config: directive) with --configfile, instead of merging them
([snakemake#3381](snakemake#3381))
([47504a0](snakemake@47504a0))
* Dynamic module name
([snakemake#3401](snakemake#3401))
([024dc32](snakemake@024dc32))
* Enable saving and reloading IOCache object
([snakemake#3386](snakemake#3386))
([c935953](snakemake@c935953))
* files added in rule params with workflow.source_path will be available
in used containers
([snakemake#3385](snakemake#3385))
([a6e45bf](snakemake@a6e45bf))
* Fix keep_local in storage directive and more freedom over remote
retrieval behaviour
([snakemake#3410](snakemake#3410))
([67b4739](snakemake@67b4739))
* inherit parameters of use rule and extend/replace individual items
them when using 'with' directive
([snakemake#3365](snakemake#3365))
([93e4b92](snakemake@93e4b92))
* Logging refactor & add LoggerPluginInterface
([snakemake#3107](snakemake#3107))
([86f1d6e](snakemake@86f1d6e))
* Maximal file size for checksums
([snakemake#3368](snakemake#3368))
([b039f8a](snakemake@b039f8a))
* Modernize package configuration using Pixi
([snakemake#3369](snakemake#3369))
([77992d8](snakemake@77992d8))
* multiext support for named input/output
([snakemake#3372](snakemake#3372))
([05e1378](snakemake@05e1378))
* optionally auto-group jobs via temp files in case of remote execution
([snakemake#3378](snakemake#3378))
([cc9bba2](snakemake@cc9bba2))


### Bug Fixes

* `--delete-all-output` ignores `--dry-run`
([snakemake#3265](snakemake#3265))
([23fef82](snakemake@23fef82))
* 3342 faster touch runs and warning messages for non-existing files
([snakemake#3398](snakemake#3398))
([cd9c3c3](snakemake@cd9c3c3))
* add default value to max-jobs-per-timespan
([snakemake#3043](snakemake#3043))
([2959abe](snakemake@2959abe))
* checkpoints inside modules are overwritten
([snakemake#3359](snakemake#3359))
([fba3ac7](snakemake@fba3ac7))
* Convert Path to IOFile
([snakemake#3405](snakemake#3405))
([c58684c](snakemake@c58684c))
* Do not perform storage object cleanup with --keep-storage-local-copies
set ([snakemake#3358](snakemake#3358))
([9a6d14b](snakemake@9a6d14b))
* edgecases of source deployment in case of remote execution
([snakemake#3396](snakemake#3396))
([5da13be](snakemake@5da13be))
* enhance error message formatting for strict DAG-building mode
([snakemake#3376](snakemake#3376))
([a1c39ee](snakemake@a1c39ee))
* fix bug in checkpoint handling that led to exceptions in case
checkpoint output was missing upon rerun
([snakemake#3423](snakemake#3423))
([8cf4a2f](snakemake@8cf4a2f))
* force check all required outputs
([snakemake#3341](snakemake#3341))
([495a4e7](snakemake@495a4e7))
* group job formatting
([snakemake#3442](snakemake#3442))
([f0b10a3](snakemake@f0b10a3))
* in remote jobs, upload storage in topological order such that
modification dates are preserved (e.g. in case of group jobs)
([snakemake#3377](snakemake#3377))
([eace08f](snakemake@eace08f))
* only skip eval when resource depends on input
([snakemake#3374](snakemake#3374))
([4574c92](snakemake@4574c92))
* Prevent execution of conda in apptainer when not explicitly requested
in software deployment method
([snakemake#3388](snakemake#3388))
([c43c5c0](snakemake@c43c5c0))
* print filenames with quotes around them in RuleException
([snakemake#3269](snakemake#3269))
([6baeda5](snakemake@6baeda5))
* Re-evaluation of free resources
([snakemake#3399](snakemake#3399))
([6371293](snakemake@6371293))
* ReadTheDocs layout issue due to src directory change
([snakemake#3419](snakemake#3419))
([695b127](snakemake@695b127))
* robustly escaping quotes in generated bash scripts (v2)
([snakemake#3297](snakemake#3297))
([snakemake#3389](snakemake#3389))
([58720bd](snakemake@58720bd))
* Show apptainer image URL in snakemake report
([snakemake#3407](snakemake#3407))
([45f0450](snakemake@45f0450))
* Update ReadTheDocs configuration for documentation build to use Pixi
([snakemake#3433](snakemake#3433))
([3f227a6](snakemake@3f227a6))


### Documentation

* Add pixi setup instructions to general use tutorial
([snakemake#3382](snakemake#3382))
([115e81b](snakemake@115e81b))
* fix contribution section heading levels, fix docs testing setup order
([snakemake#3360](snakemake#3360))
([051dc53](snakemake@051dc53))
* fix link to github.com/snakemake/poetry-snakemake-plugin
([snakemake#3436](snakemake#3436))
([ec6d97c](snakemake@ec6d97c))
* fix quoting
([snakemake#3394](snakemake#3394))
([b40f599](snakemake@b40f599))
* fix rerun-triggers default
([snakemake#3403](snakemake#3403))
([4430e23](snakemake@4430e23))
* fix typo 'safe' -&gt; 'save'
([snakemake#3384](snakemake#3384))
([7755861](snakemake@7755861))
* mention code formatting in the contribution section
([snakemake#3431](snakemake#3431))
([e8682b7](snakemake@e8682b7))
* remove duplicated 'functions'.
([snakemake#3356](snakemake#3356))
([7c595db](snakemake@7c595db))
* update broken links documentation
([snakemake#3437](snakemake#3437))
([e3d0d88](snakemake@e3d0d88))
* Updating contributing guidelines with new pixi dev setup
([snakemake#3415](snakemake#3415))
([8e95a12](snakemake@8e95a12))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: snakemake-bot <snakemake-bot-admin@googlegroups.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request hackathon-2025

Projects

Development

Successfully merging this pull request may close these issues.

introduce localtemp (temp files that shall end up on local storage and lead to auto-grouping like pipe output does)

2 participants