Skip to content

fix: backup output files marked for update and restore them in case of job failures#3833

Merged
johanneskoester merged 3 commits into
mainfrom
fix/restore-update-flagged
Nov 7, 2025
Merged

fix: backup output files marked for update and restore them in case of job failures#3833
johanneskoester merged 3 commits into
mainfrom
fix/restore-update-flagged

Conversation

@johanneskoester

@johanneskoester johanneskoester commented Nov 7, 2025

Copy link
Copy Markdown
Contributor

Description

QC

  • The PR contains a test case for the changes or the changes are already covered by an existing test case.
  • The documentation (docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

Summary by CodeRabbit

  • Documentation

    • Clarified update flag: files/directories are not removed before execution and previous versions are restored if a job fails.
  • New Features

    • Added backup/restore handling for outputs marked with "update" to preserve and recover prior content on failure.
  • Bug Fixes

    • Changed cleanup behavior to avoid removing outputs flagged as updates prematurely.
  • Tests

    • Added a test covering a failing update-flag scenario.

@github-actions

github-actions Bot commented Nov 7, 2025

Copy link
Copy Markdown
Contributor

Please format your code with pixi run format

@coderabbitai

coderabbitai Bot commented Nov 7, 2025

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

Adds a backup-and-restore workflow for outputs marked with the update flag: outputs are backed up before execution, restored on job failure, and cleaned up on success. Persistence gains backup methods and jobs coordinate backups during prepare/cleanup/postprocess; docs and tests updated.

Changes

Cohort / File(s) Summary
Documentation
docs/snakefiles/rules.rst
Clarified the update flag: Snakemake will not delete files/dirs before job execution and will restore the previous version if the job fails.
Persistence / Backups
src/snakemake/persistence.py
Added backup infrastructure: _backup_path attribute and methods backup_output, restore_output, cleanup_backup, and _get_backup_path to compute and manage backup locations.
Job lifecycle
src/snakemake/jobs.py
Annotated self.rule: Rule. Modified prepare() to back up outputs flagged update, adjusted cleanup() to skip removal of update outputs, and updated postprocess() to restore or remove backups depending on success/failure.
Tests & fixtures
tests/tests.py, tests/test_update_flag_fail/*
Added test_update_flag_fail() and test scenario files (Snakefile, in.txt, test.txt, expected-results) exercising failing rule that uses update("test.txt").

Sequence Diagram(s)

sequenceDiagram
    participant Job
    participant Persistence
    participant FS as File System

    rect rgb(245, 250, 255)
    note over Job,Persistence: Prepare
    Job->>Persistence: backup_output(path) [for outputs with update]
    Persistence->>FS: copy/move output -> backups/...
    end

    rect rgb(255, 255, 240)
    note over Job: Execute job
    Job->>Job: run()
    end

    alt Success
        rect rgb(220, 255, 220)
        note over Job,Persistence: Postprocess (success)
        Job->>Persistence: cleanup_backup(path)
        Persistence->>FS: remove backups/...
        end
    else Failure
        rect rgb(255, 230, 230)
        note over Job,Persistence: Postprocess (failure)
        Job->>Persistence: restore_output(path)
        Persistence->>FS: remove failed target (if dir), move backup -> original
        Persistence->>FS: remove backups/...
        end
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

  • Focus review: src/snakemake/jobs.py prepare/cleanup/postprocess changes and their ordering.
  • Verify: backup path computation in src/snakemake/persistence.py::_get_backup_path (absolute vs relative), and safe handling of directory vs file backups.
  • Check tests: tests/test_update_flag_fail/* properly assert restoration on failure.

Possibly related PRs

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 14.29% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically summarizes the main changes: backing up update-flagged outputs and restoring them on job failures.
Description check ✅ Passed The PR description follows the template with both QC checklist items completed, confirming test coverage and documentation updates were addressed.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/restore-update-flagged

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 03973c3 and 0eca730.

📒 Files selected for processing (5)
  • src/snakemake/jobs.py (5 hunks)
  • tests/test_update_flag_fail/Snakefile (1 hunks)
  • tests/test_update_flag_fail/expected-results/test.txt (1 hunks)
  • tests/test_update_flag_fail/in.txt (1 hunks)
  • tests/test_update_flag_fail/test.txt (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • tests/test_update_flag_fail/test.txt
  • tests/test_update_flag_fail/expected-results/test.txt
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • src/snakemake/jobs.py
🧠 Learnings (9)
📚 Learning: 2024-10-06T14:09:54.370Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: tests/test_wrapper/Snakefile:11-11
Timestamp: 2024-10-06T14:09:54.370Z
Learning: Changes made within test cases, such as in `tests/test_wrapper/Snakefile`, are for testing purposes and do not require updates to the project documentation.

Applied to files:

  • tests/test_update_flag_fail/Snakefile
📚 Learning: 2024-12-21T15:10:31.992Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 2925
File: tests/test_nonstr_params/Snakefile:14-15
Timestamp: 2024-12-21T15:10:31.992Z
Learning: The file "test.jsonl" in tests/test_nonstr_params is automatically created by Snakemake, rather than manually generated in the Snakefile.

Applied to files:

  • tests/test_update_flag_fail/Snakefile
📚 Learning: 2025-01-17T12:00:09.368Z
Learnt from: leoschwarz
Repo: snakemake/snakemake PR: 3176
File: tests/test_output_index.py:99-157
Timestamp: 2025-01-17T12:00:09.368Z
Learning: New test dependencies for Snakemake should be introduced in separate PRs rather than being added as part of feature or refactoring PRs.

Applied to files:

  • tests/test_update_flag_fail/Snakefile
📚 Learning: 2025-09-08T12:39:09.146Z
Learnt from: Hocnonsense
Repo: snakemake/snakemake PR: 3714
File: tests/tests.py:1637-1644
Timestamp: 2025-09-08T12:39:09.146Z
Learning: In Snakemake module tests, targeting log-only files like "aaalog" can be an intentional test pattern to validate module behavior and error handling, even though log files aren't typically producible targets in normal workflows.

Applied to files:

  • tests/test_update_flag_fail/Snakefile
📚 Learning: 2024-11-12T12:08:20.342Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3148
File: snakemake/dag.py:1332-1336
Timestamp: 2024-11-12T12:08:20.342Z
Learning: In `snakemake/dag.py`, when code is outdated and will disappear upon resolving merge conflicts, avoid making code review suggestions on that code.

Applied to files:

  • src/snakemake/jobs.py
📚 Learning: 2024-10-04T16:12:18.927Z
Learnt from: lczech
Repo: snakemake/snakemake PR: 3113
File: snakemake/scheduler.py:912-914
Timestamp: 2024-10-04T16:12:18.927Z
Learning: In `snakemake/scheduler.py`, avoid suggesting the use of `asyncio.gather` in the `jobs_rewards` method due to overhead concerns and the need for immediate results.

Applied to files:

  • src/snakemake/jobs.py
📚 Learning: 2024-11-05T09:41:56.469Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3184
File: snakemake/dag.py:1340-1342
Timestamp: 2024-11-05T09:41:56.469Z
Learning: The method `self.workflow.persistence.software_stack_changed(job)` in `snakemake/persistence.py` is not asynchronous.

Applied to files:

  • src/snakemake/jobs.py
📚 Learning: 2025-05-23T09:40:24.474Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3600
File: src/snakemake/jobs.py:960-964
Timestamp: 2025-05-23T09:40:24.474Z
Learning: In the `cleanup` method of the `Job` class in `src/snakemake/jobs.py`, files in the `to_remove` list should be formatted with `fmt_iofile` without specifying `as_output=True` or `as_input=True` parameters, as these files should be displayed as generic files rather than specifically as output files.

Applied to files:

  • src/snakemake/jobs.py
📚 Learning: 2025-05-06T01:37:23.382Z
Learnt from: kdm9
Repo: snakemake/snakemake PR: 3562
File: src/snakemake/checkpoints.py:90-90
Timestamp: 2025-05-06T01:37:23.382Z
Learning: In Snakemake checkpoints implementation, tracking only the first missing output for each checkpoint is sufficient, because if one output is missing, all outputs for that checkpoint are considered incomplete. This was the behavior before PR #3562 and maintained in the pluralized `checkpoints.get()` implementation.

Applied to files:

  • src/snakemake/jobs.py
🧬 Code graph analysis (1)
src/snakemake/jobs.py (5)
src/snakemake/workflow.py (6)
  • rules (610-611)
  • rules (775-776)
  • rule (1812-2080)
  • output (2096-2101)
  • persistence (516-517)
  • dag (520-521)
src/snakemake/rules.py (3)
  • Rule (78-1354)
  • output (315-316)
  • output (1404-1405)
src/snakemake/persistence.py (4)
  • rule (470-471)
  • backup_output (303-309)
  • restore_output (311-322)
  • cleanup_backup (324-330)
src/snakemake/io/__init__.py (2)
  • is_flagged (979-982)
  • exists (515-519)
src/snakemake/io/fmt.py (1)
  • fmt_iofile (5-50)
🪛 LanguageTool
tests/test_update_flag_fail/in.txt

[uncategorized] ~1-~1: "sedem"
Context: Fri Nov 7 12:41:58 PM CET 2025

(STEVILA_DO_10)


[uncategorized] ~1-~1: Zamenjajte z ustrezno ubeseditvijo
Context: Fri Nov 7 12:41:58 PM CET 2025

(STEVILA_DO_100)


[uncategorized] ~1-~1: Zamenjajte z ustrezno ubeseditvijo
Context: Fri Nov 7 12:41:58 PM CET 2025

(STEVILA_DO_100)


[uncategorized] ~1-~1: Zamenjajte z ustrezno ubeseditvijo
Context: Fri Nov 7 12:41:58 PM CET 2025

(STEVILA_DO_100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (39)
  • GitHub Check: tests (10, windows-2022, py313)
  • GitHub Check: tests (10, ubuntu-latest, py312)
  • GitHub Check: tests (9, ubuntu-latest, py312)
  • GitHub Check: tests (10, ubuntu-latest, py311)
  • GitHub Check: tests (10, ubuntu-latest, py313)
  • GitHub Check: tests (7, ubuntu-latest, py313)
  • GitHub Check: tests (9, ubuntu-latest, py313)
  • GitHub Check: tests (7, macos-latest, py313)
  • GitHub Check: tests (6, macos-latest, py313)
  • GitHub Check: tests (6, ubuntu-latest, py313)
  • GitHub Check: tests (8, windows-2022, py313)
  • GitHub Check: tests (9, ubuntu-latest, py311)
  • GitHub Check: tests (8, ubuntu-latest, py313)
  • GitHub Check: tests (7, windows-2022, py313)
  • GitHub Check: tests (7, ubuntu-latest, py312)
  • GitHub Check: tests (6, ubuntu-latest, py312)
  • GitHub Check: tests (5, macos-latest, py313)
  • GitHub Check: tests (5, ubuntu-latest, py313)
  • GitHub Check: tests (7, ubuntu-latest, py311)
  • GitHub Check: tests (4, ubuntu-latest, py313)
  • GitHub Check: tests (6, windows-2022, py313)
  • GitHub Check: tests (4, ubuntu-latest, py312)
  • GitHub Check: tests (2, windows-2022, py313)
  • GitHub Check: tests (2, macos-latest, py313)
  • GitHub Check: tests (5, windows-2022, py313)
  • GitHub Check: tests (5, ubuntu-latest, py312)
  • GitHub Check: tests (3, windows-2022, py313)
  • GitHub Check: tests (2, ubuntu-latest, py312)
  • GitHub Check: tests (3, ubuntu-latest, py312)
  • GitHub Check: tests (5, ubuntu-latest, py311)
  • GitHub Check: tests (3, ubuntu-latest, py313)
  • GitHub Check: tests (4, ubuntu-latest, py311)
  • GitHub Check: tests (3, macos-latest, py313)
  • GitHub Check: tests (1, windows-2022, py313)
  • GitHub Check: tests (3, ubuntu-latest, py311)
  • GitHub Check: tests (1, ubuntu-latest, py311)
  • GitHub Check: tests (2, ubuntu-latest, py313)
  • GitHub Check: tests (1, ubuntu-latest, py313)
  • GitHub Check: tests (2, ubuntu-latest, py311)
🔇 Additional comments (6)
tests/test_update_flag_fail/Snakefile (2)

1-1: Test setup looks appropriate.

Creating the input file dynamically at workflow start is a valid pattern for testing.


4-10: Test case correctly exercises update flag failure scenario.

The intentional failure (exit 1) allows testing that backups are properly restored when jobs with update-flagged outputs fail.

src/snakemake/jobs.py (4)

23-23: Import looks good.

Adding the Rule import enables proper type annotation on line 233.


233-233: Type annotation improves code clarity.

Explicitly annotating the rule attribute type enhances IDE support and type checking.


1237-1245: Restore/cleanup logic is sound but depends on backup fix.

The conditional logic correctly restores backups on error and cleans them up on success. However, restore_output will encounter the same storage-file issue noted at lines 841-844. Once the backup logic is fixed to handle storage-backed files, this restore/cleanup code will work correctly.

Verify that the fix for storage-backed backups (lines 841-844) also ensures restore operations work correctly.


841-844: Storage-backed update outputs will fail—unresolved from previous review.

As flagged in the previous review, backup_output(Path(f)) uses shutil.copy which requires a local file. If an update-flagged output is storage-backed and not present locally, the backup will fail and later restore_output will raise "no backup found."

You should either:

  1. Validate during DAG/rule setup that update-flagged outputs cannot be storage-backed, raising a clear error if this constraint is violated, or
  2. Retrieve storage files before backup using the async TaskGroup pattern (as used elsewhere in this file around line 790-791):
# Backup update-flagged outputs (handle storage-backed files)
async with asyncio.TaskGroup() as tg:
    for f in self.output:
        if is_flagged(f, "update"):
            async def backup_with_retrieval(f):
                if f.is_storage and not await f.exists_local():
                    await f.retrieve_from_storage()
                self.rule.workflow.persistence.backup_output(Path(f))
            tg.create_task(backup_with_retrieval(f))

for f in self.output:
    f.prepare()

Which approach do you prefer? If update+storage is not intended, option 1 is simpler.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (3)
docs/snakefiles/rules.rst (1)

3374-3376: Clarify update()+remote storage and large directory caveats.

The new guarantees look good. Recommend adding one sentence that update() applies to local files/dirs, or that Snakemake will retrieve remote copies before backing up when a default storage provider is used. Also note that backing up large directories can be expensive and suggest using more granular outputs when practical.

src/snakemake/jobs.py (1)

23-23: Avoid runtime import for type hint to prevent cycles.

Use TYPE_CHECKING + forward reference to keep imports light and cycle-safe.

-from snakemake.rules import Rule
+from typing import TYPE_CHECKING
+if TYPE_CHECKING:
+    from snakemake.rules import Rule
@@
-        self.rule: Rule = rule
+        self.rule: "Rule" = rule

Also applies to: 233-233

src/snakemake/persistence.py (1)

311-322: Consider consistent handling of existing paths.

For directories, the code explicitly removes the existing path before restoring, but for files, it relies on shutil.copy to overwrite. While functionally correct, this inconsistency could cause issues if path exists as the wrong type (e.g., a directory when a file is expected).

Apply this diff for more robust handling:

 def restore_output(self, path: Path) -> None:
     backup_path = self._get_backup_path(path)
     if not backup_path.exists():
         raise WorkflowError(f"Cannot restore {path}: no backup found.")
+    # Remove existing path regardless of type for consistency
+    if path.exists():
+        if path.is_dir():
+            shutil.rmtree(path)
+        else:
+            path.unlink()
     if backup_path.is_dir():
-        if path.exists():
-            shutil.rmtree(path)
         shutil.copytree(backup_path, path)
         shutil.rmtree(backup_path)
     else:
         shutil.copy(backup_path, path)
         backup_path.unlink()
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 804be9f and 03973c3.

📒 Files selected for processing (4)
  • docs/snakefiles/rules.rst (1 hunks)
  • src/snakemake/jobs.py (5 hunks)
  • src/snakemake/persistence.py (2 hunks)
  • tests/tests.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • tests/tests.py
  • src/snakemake/jobs.py
  • src/snakemake/persistence.py
🧠 Learnings (3)
📚 Learning: 2024-10-06T14:09:54.370Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: tests/test_wrapper/Snakefile:11-11
Timestamp: 2024-10-06T14:09:54.370Z
Learning: Changes made within test cases, such as in `tests/test_wrapper/Snakefile`, are for testing purposes and do not require updates to the project documentation.

Applied to files:

  • docs/snakefiles/rules.rst
  • tests/tests.py
📚 Learning: 2024-12-14T13:10:48.450Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 2927
File: tests/test_subpath/Snakefile:11-11
Timestamp: 2024-12-14T13:10:48.450Z
Learning: In Snakemake, when a rule defines a single output file (even if the output is specified as a dictionary), `{output}` in the shell command can be used directly to refer to that file.

Applied to files:

  • docs/snakefiles/rules.rst
📚 Learning: 2025-05-23T09:40:24.474Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3600
File: src/snakemake/jobs.py:960-964
Timestamp: 2025-05-23T09:40:24.474Z
Learning: In the `cleanup` method of the `Job` class in `src/snakemake/jobs.py`, files in the `to_remove` list should be formatted with `fmt_iofile` without specifying `as_output=True` or `as_input=True` parameters, as these files should be displayed as generic files rather than specifically as output files.

Applied to files:

  • src/snakemake/jobs.py
🧬 Code graph analysis (3)
tests/tests.py (1)
tests/common.py (2)
  • run (152-503)
  • dpath (33-36)
src/snakemake/jobs.py (5)
src/snakemake/workflow.py (6)
  • rules (610-611)
  • rules (775-776)
  • rule (1812-2080)
  • output (2096-2101)
  • persistence (516-517)
  • dag (520-521)
src/snakemake/rules.py (3)
  • Rule (78-1354)
  • output (315-316)
  • output (1404-1405)
src/snakemake/persistence.py (4)
  • rule (470-471)
  • backup_output (303-309)
  • restore_output (311-322)
  • cleanup_backup (324-330)
src/snakemake/io/__init__.py (2)
  • is_flagged (979-982)
  • exists (515-519)
src/snakemake/io/fmt.py (1)
  • fmt_iofile (5-50)
src/snakemake/persistence.py (1)
tests/common.py (1)
  • copy (117-121)
🪛 Ruff (0.14.3)
src/snakemake/persistence.py

314-314: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: tests (10, windows-2022, py313)
  • GitHub Check: tests (10, macos-latest, py313)
  • GitHub Check: tests (2, windows-2022, py313)
  • GitHub Check: tests (8, macos-latest, py313)
  • GitHub Check: tests (7, ubuntu-latest, py312)
  • GitHub Check: tests (7, ubuntu-latest, py313)
  • GitHub Check: tests (7, ubuntu-latest, py311)
  • GitHub Check: tests (2, ubuntu-latest, py313)
  • GitHub Check: tests (4, ubuntu-latest, py312)
  • GitHub Check: tests (4, ubuntu-latest, py313)
  • GitHub Check: tests (3, macos-latest, py313)
  • GitHub Check: tests (4, ubuntu-latest, py311)
  • GitHub Check: tests (3, windows-2022, py313)
  • GitHub Check: tests (2, ubuntu-latest, py311)
  • GitHub Check: tests (2, ubuntu-latest, py312)
🔇 Additional comments (5)
tests/tests.py (1)

2304-2306: Mirror Windows skip to match test_update_flag.

Given test_update_flag is skipped on Windows, consider applying @skip_on_windows here for consistency unless you verified it’s stable on Windows.

Apply if needed:

 @skip_on_windows  # OS agnostic
 def test_update_flag():
     run(dpath("test_update_flag"))
 
+@skip_on_windows
 def test_update_flag_fail():
     run(dpath("test_update_flag_fail"), shouldfail=True, check_results=True)
src/snakemake/jobs.py (1)

1235-1242: LGTM: restore on error, cleanup backup on success.

The update backup lifecycle is correct and runs before touch/log handling. Nice.

Edge case: if backup is missing (e.g., storage-only output not retrieved), restore_output raises; see suggested guard in prepare().

src/snakemake/persistence.py (3)

73-73: LGTM!

The backup path initialization is clean and follows Python path conventions.


324-330: LGTM!

The cleanup logic correctly handles both files and directories, with appropriate existence checks.


332-336: LGTM! Clever handling of absolute paths.

The logic correctly converts absolute paths to relative by stripping the root using path.relative_to(path.parents[-1]). The Path class handles cross-platform differences (Unix / vs Windows C:\), so this approach should work correctly on all platforms.

Comment thread src/snakemake/jobs.py
Comment on lines +841 to 845
for f in self.output:
if is_flagged(f, "update"):
self.rule.workflow.persistence.backup_output(Path(f))
f.prepare()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Handle update()+storage: retrieve local copy before backup.

If an update-marked output lives in a storage provider and isn’t present locally, backup_output(Path(f)) will fail, and later restore_output will raise “no backup found.” Retrieve first.

-            for f in self.output:
-                if is_flagged(f, "update"):
-                    self.rule.workflow.persistence.backup_output(Path(f))
+            for f in self.output:
+                if is_flagged(f, "update"):
+                    # Ensure local copy exists for storage-backed outputs
+                    if f.is_storage and not await f.exists_local():
+                        await f.retrieve_from_storage()
+                    self.rule.workflow.persistence.backup_output(Path(f))
                 f.prepare()

Please confirm whether update() is intended to support storage outputs; if not, consider enforcing a guard that disallows update+storage with a clear error.



---

🏁 Script executed:

```shell
#!/bin/bash
# Get context around lines 841-845 in jobs.py to see the containing function
sed -n '820,860p' src/snakemake/jobs.py

Length of output: 1453


🏁 Script executed:

#!/bin/bash
# Find the containing function name and its async/sync status
sed -n '800,850p' src/snakemake/jobs.py | grep -B 50 "for f in self.output" | grep -E "^\s*async\s+def|^\s*def " | tail -1

Length of output: 28


🏁 Script executed:

#!/bin/bash
# Search for backup_output and restore_output implementations
rg -n "def backup_output|def restore_output" --type=py

Length of output: 164


🏁 Script executed:

#!/bin/bash
# Search for the output class definition and storage-related attributes
rg -n "class.*Output|is_storage|retrieve_from_storage" --type=py -C 2 | head -100

Length of output: 5794


🏁 Script executed:

#!/bin/bash
# Search for existing update flag handling with storage
rg -n "is_flagged.*update" --type=py -C 3

Length of output: 4369



---

🏁 Script executed:

```shell
#!/bin/bash
# Check backup_output and restore_output implementations
sed -n '303,330p' src/snakemake/persistence.py

Length of output: 1048


🏁 Script executed:

#!/bin/bash
# Find f.prepare() method - search for prepare() in output-related files
rg -n "def prepare\(" --type=py src/snakemake/io/ -A 5

Length of output: 396


🏁 Script executed:

#!/bin/bash
# Check if there are any async loops or taskgroups used for output handling
rg -n "for f in.*output|for f in self.output" --type=py src/snakemake/jobs.py -A 10 | head -80

Length of output: 3074


🏁 Script executed:

#!/bin/bash
# Search for any error messages or issues related to backup_output with storage
rg -n "backup_output|restore_output" --type=py -B 2 -A 5

Length of output: 2343



---

🏁 Script executed:

```shell
#!/bin/bash
# Search for test cases or examples with update flag on storage outputs
rg -n "update.*storage|storage.*update" --type=py tests/

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Search for any validation or guards preventing update + storage combination
rg -n "update" --type=py src/snakemake/rules.py -B 2 -A 2 | head -60

Length of output: 2297


🏁 Script executed:

#!/bin/bash
# Check if there's any special handling in DAG for update-flagged files
rg -n "update_jobs|is_flagged.*update" --type=py src/snakemake/dag.py -B 1 -A 3 | head -80

Length of output: 1424


🏁 Script executed:

#!/bin/bash
# Check what Path(f) actually does - is f converting to local path or storage path?
rg -n "def __str__|def __fspath__" --type=py src/snakemake/io/ -A 3

Length of output: 434


🏁 Script executed:

#!/bin/bash
# Look for any existing patterns of checking storage before backup in the codebase
rg -n "is_storage.*backup|backup.*is_storage" --type=py -B 2 -A 2

Length of output: 0



---

🏁 Script executed:

```shell
#!/bin/bash
# Check the IOFile class __str__ and how Path(f) resolves
sed -n '1900,1930p' src/snakemake/io/__init__.py

Length of output: 951


🏁 Script executed:

#!/bin/bash
# Check if storage files are retrieved/localized before prepare() is called
rg -n "await.*retrieve|keep_storage_local" --type=py src/snakemake/jobs.py -B 2 -A 2 | head -100

Length of output: 860


🏁 Script executed:

#!/bin/bash
# Check the full prepare method to understand the execution flow
sed -n '798,860p' src/snakemake/jobs.py

Length of output: 2273


🏁 Script executed:

#!/bin/bash
# Search for any comments or documentation about update flag and storage
rg -n "update.*flag|update.*marker" --type=py src/snakemake/ | grep -i "doc\|comment\|#" | head -20

Length of output: 113


🏁 Script executed:

#!/bin/bash
# Check how files enter this prepare() - are they already local?
sed -n '750,800p' src/snakemake/jobs.py

Length of output: 1476


Storage-backed update outputs will fail at backup/restore—needs fix with proper async handling.

The concern is valid: backup_output(Path(f)) calls shutil.copy which requires a local filesystem path, but storage-backed files may not be present locally. For update-flagged storage outputs, this will fail, and restore_output will raise "no backup found."

However, the proposed diff won't work as-is—it uses await in a synchronous loop. Since prepare() is async (line 813), the fix should use the existing TaskGroup pattern (as seen at line 790-791 in the same file):

            if not self.is_norun:
                await self.remove_existing_output()

                if self.resources.get("tmpdir"):
                    os.makedirs(self.resources.tmpdir, exist_ok=True)

+               # Backup storage-backed update outputs before prepare
+               async with asyncio.TaskGroup() as tg:
+                   for f in self.output:
+                       if is_flagged(f, "update") and f.is_storage:
+                           async def backup_storage_output(f):
+                               if not await f.exists_local():
+                                   await f.retrieve_from_storage()
+                               self.rule.workflow.persistence.backup_output(Path(f))
+                           tg.create_task(backup_storage_output(f))
+
                for f in self.output:
                    if is_flagged(f, "update"):
-                       self.rule.workflow.persistence.backup_output(Path(f))
                    f.prepare()

Alternatively, consider whether update is intended to support storage outputs; if not, add a validation check in the DAG or rule setup.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/snakemake/jobs.py around lines 841-845, the loop calls synchronous
backup_output for update-flagged outputs which will fail for storage-backed
files and the proposed diff used await inside a sync loop; fix by either
validating that update-flagged outputs are local during DAG/rule setup and raise
a clear error if not, or implement async backups using the existing TaskGroup
pattern: collect update-flagged outputs, open a TaskGroup, spawn tasks that
await self.rule.workflow.persistence.backup_output(Path(f)) for storage-backed
outputs, await completion of the group, then call await f.prepare() (or schedule
prepare() tasks in the same TaskGroup if needed) so all async work runs
correctly without using await in a plain for-loop.

Comment thread src/snakemake/jobs.py Outdated
Comment on lines +303 to +309
def backup_output(self, path: Path) -> None:
backup_path = self._get_backup_path(path)
backup_path.parent.mkdir(parents=True, exist_ok=True)
if path.is_dir():
shutil.copytree(path, backup_path)
else:
shutil.copy(path, backup_path)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Add existence check and handle pre-existing backups.

The method has two potential issues:

  1. If path doesn't exist, path.is_dir() returns False, causing shutil.copy to fail with a FileNotFoundError.
  2. For directories, shutil.copytree will raise FileExistsError if backup_path already exists (e.g., from a previous failed backup that wasn't cleaned up).

Apply this diff to add validation and handle existing backups:

 def backup_output(self, path: Path) -> None:
+    if not path.exists():
+        raise WorkflowError(f"Cannot backup {path}: file does not exist.")
     backup_path = self._get_backup_path(path)
     backup_path.parent.mkdir(parents=True, exist_ok=True)
+    # Clean up any existing backup first
+    if backup_path.exists():
+        if backup_path.is_dir():
+            shutil.rmtree(backup_path)
+        else:
+            backup_path.unlink()
     if path.is_dir():
         shutil.copytree(path, backup_path)
     else:
         shutil.copy(path, backup_path)

Run the following script to verify that outputs marked with the "update" flag exist before backup_output is called:


🏁 Script executed:

#!/bin/bash
# Description: Verify that backup_output is only called on existing files

# Search for calls to backup_output in the codebase
rg -nP -C5 'backup_output\s*\(' --type=py

Length of output: 1528


[/run_scripts]


🏁 Script executed:

#!/bin/bash
# Search for is_flagged function and update flag handling
rg -nP 'is_flagged|update.*flag' --type=py -A2 -B2 | head -100

Length of output: 5283


🏁 Script executed:

#!/bin/bash
# Search for prepare() method on output files
rg -nP 'def prepare\s*\(' --type=py -A5 -B1 | head -150

Length of output: 848


🏁 Script executed:

#!/bin/bash
# Search for how backup_output is used in context of file existence
rg -nP '\.backup_output\(' --type=py -B10 -A5

Length of output: 919


[/run_scripts]


🏁 Script executed:

#!/bin/bash
# Search for restore/revert logic for update-flagged files
rg -nP 'Restoring previous version|update.*restore|backup.*restore' --type=py -B5 -A10

Length of output: 1206


🏁 Script executed:

#!/bin/bash
# Look at the restore logic more carefully
rg -nP 'if error:|if is_flagged.*update' --type=py -B3 -A15 | grep -A20 'if is_flagged.*update'

Length of output: 7792


🏁 Script executed:

#!/bin/bash
# Check if there's any file existence check before backup_output
rg -nP '(if|and).*\.exists\(\).*backup_output|backup_output.*exists' --type=py -B5 -A5

Length of output: 0


[/run_scripts]


🏁 Script executed:

#!/bin/bash
# Search for restore_output and cleanup_backup methods in persistence.py
rg -nP 'def (restore_output|cleanup_backup)' --type=py -A15

Length of output: 2021


🏁 Script executed:

#!/bin/bash
# Search for _get_backup_path to understand backup path structure
rg -nP 'def _get_backup_path' --type=py -A8

Length of output: 690


Handle missing output files when backing up "update"-flagged outputs.

The method does not check whether path exists before calling path.is_dir(). On the first run of a rule with the "update" flag, the output file doesn't exist yet, causing path.is_dir() to return False and shutil.copy() to fail with FileNotFoundError.

Additionally, if a previous backup exists (e.g., from a crashed job), shutil.copytree() will raise FileExistsError without the dirs_exist_ok parameter.

Apply this diff to validate file existence and handle pre-existing backups:

 def backup_output(self, path: Path) -> None:
+    if not path.exists():
+        return  # No existing file to backup on first run
     backup_path = self._get_backup_path(path)
     backup_path.parent.mkdir(parents=True, exist_ok=True)
+    # Clean up any existing backup first (edge case: crashed job)
+    if backup_path.exists():
+        if backup_path.is_dir():
+            shutil.rmtree(backup_path)
+        else:
+            backup_path.unlink()
     if path.is_dir():
         shutil.copytree(path, backup_path)
     else:
         shutil.copy(path, backup_path)
🤖 Prompt for AI Agents
In src/snakemake/persistence.py around lines 303 to 309, the backup_output
method calls path.is_dir() and shutil.copy/copytree without verifying the source
exists and without handling pre-existing backups; first check if path.exists()
and return early (no-op) if it does not exist; when copying directories use
shutil.copytree(path, backup_path, dirs_exist_ok=True) to allow overwriting
existing backup directories (or alternatively remove existing backup_path before
copytree on older Python versions); keep the existing parent.mkdir(parents=True,
exist_ok=True) for the backup path.

Comment thread src/snakemake/persistence.py
@johanneskoester johanneskoester merged commit 3e0904d into main Nov 7, 2025
60 checks passed
@johanneskoester johanneskoester deleted the fix/restore-update-flagged branch November 7, 2025 15:47
johanneskoester pushed a commit that referenced this pull request Nov 7, 2025
🤖 I have created a release *beep* *boop*
---


##
[9.13.7](v9.13.6...v9.13.7)
(2025-11-07)


### Bug Fixes

* backup output files marked for update and restore them in case of job
failures ([#3833](#3833))
([3e0904d](3e0904d))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
kjohnsen pushed a commit to kjohnsen/snakemake that referenced this pull request Dec 15, 2025
…f job failures (snakemake#3833)

### Description

<!--Add a description of your PR here-->

### QC
<!-- Make sure that you can tick the boxes below. -->

* [x] The PR contains a test case for the changes or the changes are
already covered by an existing test case.
* [x] The documentation (`docs/`) is updated to reflect the changes or
this is not necessary (e.g. if the change does neither modify the
language nor the behavior or functionalities of Snakemake).


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Documentation**
* Clarified update flag: files/directories are not removed before
execution and previous versions are restored if a job fails.

* **New Features**
* Added backup/restore handling for outputs marked with "update" to
preserve and recover prior content on failure.

* **Bug Fixes**
* Changed cleanup behavior to avoid removing outputs flagged as updates
prematurely.

* **Tests**
  * Added a test covering a failing update-flag scenario.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
kjohnsen pushed a commit to kjohnsen/snakemake that referenced this pull request Dec 15, 2025
🤖 I have created a release *beep* *boop*
---


##
[9.13.7](snakemake/snakemake@v9.13.6...v9.13.7)
(2025-11-07)


### Bug Fixes

* backup output files marked for update and restore them in case of job
failures ([snakemake#3833](snakemake#3833))
([3e0904d](snakemake@3e0904d))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant