Skip to content

fix: remove temp files when using checkpoints#3716

Merged
johanneskoester merged 5 commits intosnakemake:mainfrom
danpal96:temp-checkpoint
Sep 4, 2025
Merged

fix: remove temp files when using checkpoints#3716
johanneskoester merged 5 commits intosnakemake:mainfrom
danpal96:temp-checkpoint

Conversation

@danpal96
Copy link
Copy Markdown
Contributor

@danpal96 danpal96 commented Aug 28, 2025

  • Added _handle_temp_jobs instance attribute to the DAG class to track jobs whose temp files have not yet been removed via handle_temp.
  • Modified the finish method to wait until all checkpoint jobs are done before removing the temp files stored in _handle_temp_jobs.

QC

  • The PR contains a test case for the changes or the changes are already covered by an existing test case.
  • The documentation (docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

Summary by CodeRabbit

  • Bug Fixes

    • Fixes handling of temporary files during checkpoint-driven runs by deferring and processing them after a pass to prevent premature removals.
  • Performance

    • Reduces redundant temp-file processing for more stable, efficient checkpoint re-evaluation.
  • Tests

    • Adds an end-to-end test and expected-result fixtures to validate checkpoint + temp-file aggregation and workflow behavior.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Aug 28, 2025

📝 Walkthrough

Walkthrough

Adds deferred temporary-file job handling to DAG.finish via a new private queue _deferred_temp_jobs and drains deferred temp jobs once no checkpoint jobs remain. Adds a checkpoint-driven test workflow, expected-result fixtures, and a unit test exercising temp/checkpoint behavior.

Changes

Cohort / File(s) Change Summary
DAG temp-file handling
src/snakemake/dag.py
Add private attribute self._deferred_temp_jobs to accumulate temp-file jobs when checkpoint jobs are present; in finish defer temp handling by appending jobs when checkpoints exist, otherwise drain and process temp jobs from chain(jobs, self._deferred_temp_jobs) and clear the deferred queue. No public API changes.
Checkpoint temp workflow & fixtures
tests/test_temp_checkpoint/Snakefile, tests/test_temp_checkpoint/expected-results/...
Add a Snakemake workflow demonstrating checkpoint-driven dynamic inputs (somestep checkpoint, temp rule, intermediate/alt_intermediate, aggregate_input(wildcards)) and expected aggregated outputs results/aggregated/a.txt, b.txt, c.txt.
Test invocation
tests/tests.py
Add test_temp_checkpoint() unit test that runs the dataset (cleanup disabled), asserts aggregated outputs (a.txt,b.txt,c.txt) match expected, and removes the temporary run directory.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    actor Runner as Runner
    participant DAG as DAG.finish
    participant Temp as handle_temp

    Runner->>DAG: finish(jobs)
    alt any job is checkpoint
        DAG->>DAG: append current jobs to _deferred_temp_jobs
    else no checkpoint jobs
        DAG->>DAG: for job in chain(jobs, _deferred_temp_jobs)
        DAG->>Temp: handle_temp(job)
        DAG->>DAG: clear _deferred_temp_jobs
    end
    note over DAG,Temp: Deferred temp jobs processed only when no checkpoint jobs remain
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Possibly related PRs


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d8f7c4a and 19911f5.

📒 Files selected for processing (1)
  • src/snakemake/dag.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/snakemake/dag.py
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@danpal96
Copy link
Copy Markdown
Contributor Author

PR for #3715

Copy link
Copy Markdown
Contributor

@johanneskoester johanneskoester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot! I must admit that I don't yet understand why this is needed. Can you elaborate a bit, or provide a testcase/example where it currently fails?

@johanneskoester johanneskoester self-assigned this Sep 4, 2025
- Added `_handle_temp_jobs` instance attribute to the DAG class to track
  jobs whose temp files have not yet been removed via `handle_temp`.
- Modified the `finish` method to wait until all checkpoint jobs are done
  before removing the temp files stored in `_handle_temp_jobs`.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/tests.py (1)

2469-2480: Guard against Windows shell incompatibilities

This test relies on POSIX tools used in the Snakefile (cat, echo). To avoid CI flakes on Windows, mark it as skipped there, in line with surrounding tests.

@@
- def test_temp_checkpoint():
+@skip_on_windows
+def test_temp_checkpoint():
🧹 Nitpick comments (1)
tests/tests.py (1)

2479-2479: Make cleanup resilient on Windows

Use existing ON_WINDOWS flag to ignore removal errors, matching the pattern used elsewhere in this file.

-    shutil.rmtree(tmpdir)
+    shutil.rmtree(tmpdir, ignore_errors=ON_WINDOWS)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 37841fb and 9932de0.

📒 Files selected for processing (5)
  • tests/test_temp_checkpoint/Snakefile (1 hunks)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/a.txt (1 hunks)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/b.txt (1 hunks)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/c.txt (1 hunks)
  • tests/tests.py (1 hunks)
✅ Files skipped from review due to trivial changes (3)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/b.txt
  • tests/test_temp_checkpoint/expected-results/results/aggregated/a.txt
  • tests/test_temp_checkpoint/expected-results/results/aggregated/c.txt
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • tests/tests.py
🧠 Learnings (1)
📚 Learning: 2024-10-06T14:09:54.370Z
Learnt from: johanneskoester
PR: snakemake/snakemake#3117
File: tests/test_wrapper/Snakefile:11-11
Timestamp: 2024-10-06T14:09:54.370Z
Learning: Changes made within test cases, such as in `tests/test_wrapper/Snakefile`, are for testing purposes and do not require updates to the project documentation.

Applied to files:

  • tests/test_temp_checkpoint/Snakefile
🧬 Code graph analysis (1)
tests/tests.py (1)
tests/common.py (2)
  • run (152-496)
  • dpath (33-36)
🪛 Ruff (0.12.2)
tests/tests.py

2478-2478: Use of assert detected

(S101)

🔇 Additional comments (1)
tests/test_temp_checkpoint/Snakefile (1)

1-66: Good, targeted fixture exercising deferred temp cleanup via checkpoints

The workflow cleanly models temp outputs across checkpoint and intermediates, and aggregate_input uses checkpoints...open() correctly. Looks good.

@danpal96
Copy link
Copy Markdown
Contributor Author

danpal96 commented Sep 4, 2025

I tried to explain this in more detail in #3715, but in short, I expect that in the test below all temporary files should be removed by the end of the workflow. However, only one of ["results/temp/a.txt", "results/temp/b.txt", "results/temp/c.txt"] is actually removed. This happens because temporary files are deleted by the handle_temp method in the finish method of DAG only if all checkpoint rules have finished, and this condition is true only for the last one that completes. To address this, I created a list to store the rules for which handle_temp was not called, and defer their cleanup until no checkpoints remain. I have committed this test:

# a target rule to define the desired final output
rule all:
    input:
        "results/aggregated/a.txt",
        "results/aggregated/b.txt",
        "results/aggregated/c.txt",

rule temp:
    output:
        temp("results/temp/{sample}.txt")
    shell:
        # simulate some output value
        "echo {wildcards.sample} > {output:q}"

# the checkpoint that shall trigger re-evaluation of the DAG
checkpoint somestep:
    input:
        "results/temp/{sample}.txt"
    output:
        temp("results/somestep/{sample}.txt")
    shell:
        # simulate some output value
        "cat {input:q} > {output:q}"


# intermediate rule
rule intermediate:
    input:
        "results/somestep/{sample}.txt"
    output:
        temp("results/post/{sample}.txt")
    shell:
        "cat {input:q} > {output:q}"


# alternative intermediate rule
rule alt_intermediate:
    input:
        "results/somestep/{sample}.txt"
    output:
        temp("results/alt/{sample}.txt")
    shell:
        "cat {input:q} > {output:q}"


# input function for the rule aggregate
def aggregate_input(wildcards):
    # decision based on content of output file
    # Important: use the method open() of the returned file!
    # This way, Snakemake is able to automatically download the file if it is generated in
    # a cloud environment without a shared filesystem.
    with checkpoints.somestep.get(sample=wildcards.sample).output[0].open() as f:
        if f.read().strip() == "a":
            return "results/post/{sample}.txt"
        else:
            return "results/alt/{sample}.txt"


rule aggregate:
    input:
        aggregate_input
    output:
        "results/aggregated/{sample}.txt"
    shell:
        "cat {input:q} > {output:q}"

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
tests/tests.py (2)

2495-2496: Filter to files only to avoid spurious matches

Guard against directories being included by glob in edge cases.

-    real = set(tmpdir.glob("results/*/*"))
+    real = {p for p in tmpdir.glob("results/*/*") if p.is_file()}

2497-2497: Harden cleanup on Windows

Match existing pattern in this test module to avoid intermittent rmtree failures on Windows.

-    shutil.rmtree(tmpdir)
+    shutil.rmtree(tmpdir, ignore_errors=ON_WINDOWS)
tests/test_temp_checkpoint/Snakefile (1)

47-56: Prefer explicit formatting over brace placeholders in input function

Returning fully formatted paths is a bit clearer and avoids relying on implicit wildcard substitution.

 def aggregate_input(wildcards):
@@
-    with checkpoints.somestep.get(sample=wildcards.sample).output[0].open() as f:
-        if f.read().strip() == "a":
-            return "results/post/{sample}.txt"
-        else:
-            return "results/alt/{sample}.txt"
+    with checkpoints.somestep.get(sample=wildcards.sample).output[0].open() as f:
+        if f.read().strip() == "a":
+            return f"results/post/{wildcards.sample}.txt"
+        else:
+            return f"results/alt/{wildcards.sample}.txt"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 9932de0 and 1e4cb11.

📒 Files selected for processing (6)
  • src/snakemake/dag.py (2 hunks)
  • tests/test_temp_checkpoint/Snakefile (1 hunks)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/a.txt (1 hunks)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/b.txt (1 hunks)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/c.txt (1 hunks)
  • tests/tests.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/b.txt
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/c.txt
  • tests/test_temp_checkpoint/expected-results/results/aggregated/a.txt
  • src/snakemake/dag.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • tests/tests.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: johanneskoester
PR: snakemake/snakemake#3096
File: snakemake/dag.py:1943-1943
Timestamp: 2024-09-19T18:47:50.192Z
Learning: In the temp file deletion logic, temp files are only deleted after confirming that no other job (including non-checkpoint jobs) still needs them.
Learnt from: johanneskoester
PR: snakemake/snakemake#3096
File: snakemake/dag.py:1943-1943
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In the temp file deletion logic, temp files are only deleted after confirming that no other job (including non-checkpoint jobs) still needs them.
📚 Learning: 2024-10-06T14:09:54.370Z
Learnt from: johanneskoester
PR: snakemake/snakemake#3117
File: tests/test_wrapper/Snakefile:11-11
Timestamp: 2024-10-06T14:09:54.370Z
Learning: Changes made within test cases, such as in `tests/test_wrapper/Snakefile`, are for testing purposes and do not require updates to the project documentation.

Applied to files:

  • tests/test_temp_checkpoint/Snakefile
🧬 Code graph analysis (1)
tests/tests.py (1)
tests/common.py (2)
  • run (152-503)
  • dpath (33-36)
🪛 Ruff (0.12.2)
tests/tests.py

2496-2496: Use of assert detected

(S101)

🔇 Additional comments (2)
tests/tests.py (1)

2487-2497: Good targeted test for checkpoint-driven temp cleanup

Covers that only final aggregated files remain, aligning with the intended deferred temp handling behavior.

tests/test_temp_checkpoint/Snakefile (1)

16-44: Checkpoint and temp usage LGTM

The checkpoint plus temp-marked intermediates model the issue well; using IOFile.open() is correct for remote/shared FS scenarios.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
tests/tests.py (2)

2487-2488: Skip on Windows to avoid shell incompatibilities

Snakefile uses cat/echo; many tests already skip on Windows for this reason. Recommend marking this test similarly to prevent spurious CI failures.

Apply:

- def test_temp_checkpoint():
+@skip_on_windows
+def test_temp_checkpoint():

2496-2496: Silence Ruff S101 in tests or adjust config

Ruff flags assert (S101). In tests this is fine; either ignore S101 for tests in config, or add a line-level noqa to keep lint clean.

One-line fix:

-    assert expected == real, "temp files not removed"
+    assert expected == real, "temp files not removed"  # noqa: S101
tests/test_temp_checkpoint/Snakefile (1)

8-24: Shell portability note (optional)

The rules rely on echo/cat. If you decide not to skip the test on Windows, consider setting a POSIX shell explicitly in the test context instead. Otherwise, keeping the test skipped on Windows is fine.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1e4cb11 and 2e204d9.

📒 Files selected for processing (5)
  • tests/test_temp_checkpoint/Snakefile (1 hunks)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/a.txt (1 hunks)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/b.txt (1 hunks)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/c.txt (1 hunks)
  • tests/tests.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/a.txt
  • tests/test_temp_checkpoint/expected-results/results/aggregated/c.txt
  • tests/test_temp_checkpoint/expected-results/results/aggregated/b.txt
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • tests/tests.py
🧠 Learnings (1)
📓 Common learnings
Learnt from: johanneskoester
PR: snakemake/snakemake#3096
File: snakemake/dag.py:1943-1943
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In the temp file deletion logic, temp files are only deleted after confirming that no other job (including non-checkpoint jobs) still needs them.
Learnt from: johanneskoester
PR: snakemake/snakemake#3096
File: snakemake/dag.py:1943-1943
Timestamp: 2024-09-19T18:47:50.192Z
Learning: In the temp file deletion logic, temp files are only deleted after confirming that no other job (including non-checkpoint jobs) still needs them.
🧬 Code graph analysis (1)
tests/tests.py (1)
tests/common.py (2)
  • run (152-503)
  • dpath (33-36)
🪛 Ruff (0.12.2)
tests/tests.py

2496-2496: Use of assert detected

(S101)

🔇 Additional comments (2)
tests/tests.py (1)

2487-2497: Good regression coverage for checkpoint + temp cleanup

Solid, minimal repro that asserts only final aggregated artifacts remain. This directly guards against the bug from #3715 and aligns with the learning that temp files must be deleted only when no job still needs them.

tests/test_temp_checkpoint/Snakefile (1)

46-57: Correct checkpoint usage with open() and deferred temp intermediates

Nice use of checkpoints.somestep.get(...).output[0].open() and temp-marked intermediates to exercise the cleanup path. This matches the intended “delete temp only when safe” behavior.

@johanneskoester johanneskoester enabled auto-merge (squash) September 4, 2025 15:12
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (6)
src/snakemake/dag.py (2)

126-126: Initialize deferred temp-jobs queue — OK.

Optional: consider a set to dedupe if the same job could be enqueued multiple times (shouldn’t happen, but cheap protection).


2279-2282: Optional: batch deletions concurrently.

When draining many deferred jobs, running handle_temp sequentially can be slow on I/O-heavy workflows. Consider a small TaskGroup for parallelism.

Example:

# replace the for-loop body with:
async with asyncio.TaskGroup() as tg:
    for job in live_jobs:
        tg.create_task(self.handle_temp(job))
tests/tests.py (3)

2491-2491: Drop redundant Path() wrap

run(..., cleanup=False) already returns a Path (see tests/common.py). The extra conversion is unnecessary.

Apply:

-    tmpdir = run(dpath("test_temp_checkpoint"), cleanup=False)
-    tmpdir = Path(tmpdir)
+    tmpdir = run(dpath("test_temp_checkpoint"), cleanup=False)

2497-2497: Ruff S101 in tests: add a local ignore if needed

Using assert in tests is fine. If your linter flags it, silence just this instance.

-    assert expected == real, "temp files not removed"
+    assert expected == real, "temp files not removed"  # noqa: S101

2498-2498: Harmonize tmpdir cleanup with Windows behavior

Most tests use ignore_errors=ON_WINDOWS. Align for consistency and to avoid occasional removal errors on Windows runners (even though this test is skipped there).

-    shutil.rmtree(tmpdir)
+    shutil.rmtree(tmpdir, ignore_errors=ON_WINDOWS)
tests/test_temp_checkpoint/Snakefile (1)

47-56: Minor: return templated path vs formatted string is intentional—keep as-is

Returning "results/post/{sample}.txt" (templated) is preferable here so Snakemake can expand wildcards; avoid switching to f-strings that hard-substitute the value.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 2e204d9 and d8f7c4a.

📒 Files selected for processing (6)
  • src/snakemake/dag.py (2 hunks)
  • tests/test_temp_checkpoint/Snakefile (1 hunks)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/a.txt (1 hunks)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/b.txt (1 hunks)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/c.txt (1 hunks)
  • tests/tests.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tests/test_temp_checkpoint/expected-results/results/aggregated/a.txt
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • src/snakemake/dag.py
  • tests/tests.py
🧠 Learnings (3)
📓 Common learnings
Learnt from: johanneskoester
PR: snakemake/snakemake#3096
File: snakemake/dag.py:1943-1943
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In the temp file deletion logic, temp files are only deleted after confirming that no other job (including non-checkpoint jobs) still needs them.
Learnt from: johanneskoester
PR: snakemake/snakemake#3096
File: snakemake/dag.py:1943-1943
Timestamp: 2024-09-19T18:47:50.192Z
Learning: In the temp file deletion logic, temp files are only deleted after confirming that no other job (including non-checkpoint jobs) still needs them.
📚 Learning: 2024-10-08T17:41:54.542Z
Learnt from: johanneskoester
PR: snakemake/snakemake#3096
File: snakemake/dag.py:1943-1943
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In the temp file deletion logic, temp files are only deleted after confirming that no other job (including non-checkpoint jobs) still needs them.

Applied to files:

  • src/snakemake/dag.py
📚 Learning: 2024-10-06T14:09:54.370Z
Learnt from: johanneskoester
PR: snakemake/snakemake#3117
File: tests/test_wrapper/Snakefile:11-11
Timestamp: 2024-10-06T14:09:54.370Z
Learning: Changes made within test cases, such as in `tests/test_wrapper/Snakefile`, are for testing purposes and do not require updates to the project documentation.

Applied to files:

  • tests/test_temp_checkpoint/Snakefile
🧬 Code graph analysis (1)
tests/tests.py (1)
tests/common.py (2)
  • run (152-503)
  • dpath (33-36)
🪛 Ruff (0.12.2)
tests/tests.py

2497-2497: Use of assert detected

(S101)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: tests (9, macos-latest, py312)
  • GitHub Check: tests (8, macos-latest, py313)
  • GitHub Check: tests (8, macos-latest, py312)
  • GitHub Check: tests (4, macos-latest, py313)
  • GitHub Check: tests (2, windows-latest, py313)
  • GitHub Check: tests (2, windows-latest, py312)
🔇 Additional comments (4)
tests/test_temp_checkpoint/expected-results/results/aggregated/c.txt (1)

1-1: Fixture looks good.

Matches the test’s expectation for aggregated output.

tests/test_temp_checkpoint/expected-results/results/aggregated/b.txt (1)

1-1: Fixture looks good.

Matches the test’s expectation for aggregated output.

tests/tests.py (1)

2487-2498: Solid regression test to assert full temp cleanup with checkpoints

  • Skips on Windows (good; uses Unix tools).
  • Asserts only the three aggregated outputs remain, matching the intended behavior after deferred temp handling.
tests/test_temp_checkpoint/Snakefile (1)

1-66: Well-constructed checkpoint workflow to exercise deferred temp handling

  • Correctly marks all intermediates (temp, somestep, post, alt) as temp, ensuring they’re candidates for cleanup.
  • Uses checkpoints.somestep.get(...).output[0].open() in aggregate_input, which is the right pattern for remote/shared FS compatibility.
  • Dynamic routing to post/alt based on content neatly reproduces the original bug scenario.

@johanneskoester johanneskoester merged commit 5ff3e20 into snakemake:main Sep 4, 2025
80 checks passed
johanneskoester pushed a commit that referenced this pull request Sep 5, 2025
🤖 I have created a release *beep* *boop*
---


##
[9.11.0](v9.10.1...v9.11.0)
(2025-09-05)


### Features

* add landing page to the report containing custom metadata defined with
a YTE yaml template
([#3452](#3452))
([c754d80](c754d80))
* Issue a warning when unsupported characters are used in wildcard names
([#1587](#1587))
([fa57355](fa57355))


### Bug Fixes

* avoid checking output files in immediate-submit mode
([#3569](#3569))
([58c42cf](58c42cf))
* clarify --keep-going flag help text to distinguish runtime vs DAG
construction errors
([#3705](#3705))
([a3a8ef4](a3a8ef4))
* enable docstring assignment in `use rule ... with:` block
([#3710](#3710))
([2191962](2191962))
* fix missing attribute error in greedy scheduler settings when using
touch, dryrun or immediate submit.
([6471004](6471004))
* only trigger script with CODE label
([#3707](#3707))
([2d01f8c](2d01f8c))
* parser.py incorrectly parsing multiline f-strings
([#3701](#3701))
([06ece76](06ece76))
* parsing ordinary yaml strings
([#3506](#3506))
([ddf334e](ddf334e))
* remove temp files when using checkpoints
([#3716](#3716))
([5ff3e20](5ff3e20))
* Restore python rules changes triggering reruns. (GH:
[#3213](#3213))
([#3485](#3485))
([2f663be](2f663be))
* unit testing
([#3680](#3680))
([06ba7e6](06ba7e6))
* use queue_input_wait_time when updating queue input jobs
([#3712](#3712))
([a945a19](a945a19))


### Documentation

* output files within output directories is an error
([#2848](#2848))
([#2913](#2913))
([37272e5](37272e5))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
kjohnsen pushed a commit to kjohnsen/snakemake that referenced this pull request Dec 15, 2025
- Added `_handle_temp_jobs` instance attribute to the DAG class to track
jobs whose temp files have not yet been removed via `handle_temp`.
- Modified the `finish` method to wait until all checkpoint jobs are
done before removing the temp files stored in `_handle_temp_jobs`.

<!--Add a description of your PR here-->

### QC
<!-- Make sure that you can tick the boxes below. -->

* [x] The PR contains a test case for the changes or the changes are
already covered by an existing test case.
* [x] The documentation (`docs/`) is updated to reflect the changes or
this is not necessary (e.g. if the change does neither modify the
language nor the behavior or functionalities of Snakemake).


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Temporary-file handling is deferred during a job pass and queued items
are drained afterward to prevent premature deletions.

* **Performance**
* Reduces redundant temp-file processing for more stable, efficient
checkpoint-driven runs.

* **Tests**
* Adds an end-to-end checkpoint+temp-file test with expected-result
fixtures to validate aggregation and workflow behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Daniel Esteban Palma Igor <daniel.palma.i@ug.uchile.cl>
Co-authored-by: Johannes Koester <johannes.koester@uni-due.de>
kjohnsen pushed a commit to kjohnsen/snakemake that referenced this pull request Dec 15, 2025
🤖 I have created a release *beep* *boop*
---


##
[9.11.0](snakemake/snakemake@v9.10.1...v9.11.0)
(2025-09-05)


### Features

* add landing page to the report containing custom metadata defined with
a YTE yaml template
([snakemake#3452](snakemake#3452))
([c754d80](snakemake@c754d80))
* Issue a warning when unsupported characters are used in wildcard names
([snakemake#1587](snakemake#1587))
([fa57355](snakemake@fa57355))


### Bug Fixes

* avoid checking output files in immediate-submit mode
([snakemake#3569](snakemake#3569))
([58c42cf](snakemake@58c42cf))
* clarify --keep-going flag help text to distinguish runtime vs DAG
construction errors
([snakemake#3705](snakemake#3705))
([a3a8ef4](snakemake@a3a8ef4))
* enable docstring assignment in `use rule ... with:` block
([snakemake#3710](snakemake#3710))
([2191962](snakemake@2191962))
* fix missing attribute error in greedy scheduler settings when using
touch, dryrun or immediate submit.
([6471004](snakemake@6471004))
* only trigger script with CODE label
([snakemake#3707](snakemake#3707))
([2d01f8c](snakemake@2d01f8c))
* parser.py incorrectly parsing multiline f-strings
([snakemake#3701](snakemake#3701))
([06ece76](snakemake@06ece76))
* parsing ordinary yaml strings
([snakemake#3506](snakemake#3506))
([ddf334e](snakemake@ddf334e))
* remove temp files when using checkpoints
([snakemake#3716](snakemake#3716))
([5ff3e20](snakemake@5ff3e20))
* Restore python rules changes triggering reruns. (GH:
[snakemake#3213](snakemake#3213))
([snakemake#3485](snakemake#3485))
([2f663be](snakemake@2f663be))
* unit testing
([snakemake#3680](snakemake#3680))
([06ba7e6](snakemake@06ba7e6))
* use queue_input_wait_time when updating queue input jobs
([snakemake#3712](snakemake#3712))
([a945a19](snakemake@a945a19))


### Documentation

* output files within output directories is an error
([snakemake#2848](snakemake#2848))
([snakemake#2913](snakemake#2913))
([37272e5](snakemake@37272e5))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants