Skip to content

perf: switch reretry with tenacity; decouple container classes (with Python 3.7 compat for old scripts) from rest of the codebase (enabling moving to newer python versions)#4032

Merged
johanneskoester merged 32 commits intosnakemake:mainfrom
fgvieira:retry
Mar 11, 2026

Conversation

@fgvieira
Copy link
Copy Markdown
Contributor

@fgvieira fgvieira commented Mar 10, 2026

Fixes #3959

QC

  • The PR contains a test case for the changes or the changes are already covered by an existing test case.
  • The documentation (docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

Summary by CodeRabbit

  • Refactor
    • Improved resilience of repository fetches by adding automatic retries with exponential backoff to reduce failures from transient network issues.
    • Consolidated file-opening and caching into a single, unified, retry-enabled flow so local and remote sources are accessed more reliably and consistently.
    • Preserved explicit handling for binary and text reads to maintain correct encoding and data integrity.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds tenacity-based retrying to network/git operations in src/snakemake/sourcecache.py: introduces a retry-decorated SourceCache._open and retry-decorated HostedGitRepo.fetch, removes _open_local_or_remote, and funnels open/cache flows through the centralized _open.

Changes

Cohort / File(s) Summary
Source cache & git fetch retry
src/snakemake/sourcecache.py
Replaced ad-hoc retrying with tenacity: added @retry on HostedGitRepo.fetch and a new SourceCache._open(source_file, mode, encoding=None) decorated with exponential backoff/stop/after_log; removed _open_local_or_remote; consolidated open, _cache, and _do_cache to call _open; adjusted binary/text encoding handling; added tenacity imports (retry, stop_after_attempt, wait_exponential, after_log).

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant SourceCache
    participant LocalFS
    participant HostedGitRepo
    participant CacheStore

    Caller->>SourceCache: open(source_file, mode)
    SourceCache->>LocalFS: check cache/local for file
    alt file found locally
        LocalFS-->>SourceCache: file handle/stream
        SourceCache-->>Caller: return handle
    else file not present locally
        SourceCache->>HostedGitRepo: fetch(source_file) (tenacity retry)
        HostedGitRepo-->>SourceCache: file stream
        SourceCache->>CacheStore: write cached copy
        CacheStore-->>SourceCache: cached entry
        SourceCache-->>Caller: return handle
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (1 warning, 2 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ❓ Inconclusive The PR description references the linked issue #3959 but provides no implementation details, rationale, or explanation of the changes made. The QC checklist boxes are unchecked with no reasoning provided. Add details about why tenacity was chosen, how it resolves the log parsing issues mentioned in #3959, and explain the status of test coverage and documentation updates.
Title check ❓ Inconclusive The title mentions switching from reretry to tenacity and decoupling container classes, but the changeset only shows reretry replacement in one file with limited scope. The second part about Python 3.7 compatibility and decoupling is not evident in the actual changes provided. Clarify whether the PR scope includes broader decoupling work beyond the reretry-to-tenacity switch, or consider making the title focus on the primary change: replacing reretry with tenacity in sourcecache.py.
✅ Passed checks (2 passed)
Check name Status Explanation
Linked Issues check ✅ Passed The code changes successfully implement the core objective from #3959: replacing reretry with tenacity for retry handling with proper exponential backoff and logging configuration.
Out of Scope Changes check ✅ Passed All changes are directly related to replacing reretry with tenacity in the sourcecache module, including retry decorators, centralized retry logic, and logging integration. No unrelated modifications were detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/snakemake/sourcecache.py`:
- Around line 384-389: The HostedGitRepo.fetch method currently builds a
callable with partial(self.repo.remotes.origin.fetch,
"+refs/heads/*:refs/heads/*") but never executes it and still sets self._fetched
= True; replace the partial with an actual call to
self.repo.remotes.origin.fetch("+refs/heads/*:refs/heads/*") so the fetch runs,
remove the inner try/except that returns str(e) (do not swallow exceptions) so
tenacity can retry and propagate failures, and only set self._fetched = True
after a successful fetch; if callers require Optional[str], handle/convert the
final exception one level up instead of catching inside HostedGitRepo.fetch.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c23199c3-295f-4b40-a462-15c6173fbbf8

📥 Commits

Reviewing files that changed from the base of the PR and between b88171c and b7b51e4.

⛔ Files ignored due to path filters (1)
  • pyproject.toml is excluded by !pyproject.toml
📒 Files selected for processing (1)
  • src/snakemake/sourcecache.py

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
src/snakemake/sourcecache.py (3)

383-389: ⚠️ Potential issue | 🔴 Critical

partial(...) never executes the fetch; exception handling defeats retry logic.

The partial(self.repo.remotes.origin.fetch, ...) on line 386 only constructs a callable—it never invokes it (missing ()), so the fetch never runs yet _fetched = True is set unconditionally.

Additionally, catching GitCommandError and returning str(e) means tenacity sees the method succeed (returns a string) and will never retry on transient failures.

Proposed fix
         try:
-            # arg is needed in order to have a refspec, associated issue with
-            # workaround is here: https://github.com/gitpython-developers/GitPython/issues/296
-            partial(self.repo.remotes.origin.fetch, "+refs/heads/*:refs/heads/*")
+            # refspec arg needed per GitPython issue `#296`
+            self.repo.remotes.origin.fetch("+refs/heads/*:refs/heads/*")
         except git.GitCommandError as e:
-            return str(e)
+            raise  # Let tenacity handle retry
         self._fetched = True

If callers rely on the Optional[str] return contract, handle the final exception one level up (e.g., in fetch_if_required) after retries are exhausted.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/sourcecache.py` around lines 383 - 389, The code constructs a
callable with partial(self.repo.remotes.origin.fetch,
"+refs/heads/*:refs/heads/*") but never invokes it and then unconditionally sets
self._fetched and returns on GitCommandError, which defeats retry logic; change
this so you actually call the fetch (invoke the partial or call
self.repo.remotes.origin.fetch with the refspec), do not catch
git.GitCommandError here so tenacity can retry, and only set self._fetched =
True after a successful fetch; preserve the Optional[str] contract by
propagating or re-raising the final exception to be handled in fetch_if_required
(or return an error string there) rather than returning str(e) from this method.

766-771: ⚠️ Potential issue | 🔴 Critical

Decorator references undefined retries—breaks module import.

@retry(..., stop=stop_after_attempt(retries), ...) is evaluated at class definition time, but retries doesn't exist in module scope—it was presumably intended to be a parameter. This causes the NameError shown in the pipeline failure. Additionally, logging.DEBUG requires importing logging.

If a fixed retry count is acceptable, hardcode it. If per-call configuration is needed, use tenacity's runtime API instead:

Option 1: Hardcoded retry count (simpler)
+import logging
 from tenacity import retry, stop_after_attempt, wait_exponential, after_log
...
 `@retry`(
     wait=wait_exponential(multiplier=2, min=3),
-    stop=stop_after_attempt(retries),
+    stop=stop_after_attempt(3),
     after=after_log(logger, logging.DEBUG),
 )
 def _open(self, source_file: SourceFile, mode, encoding=None):
Option 2: Runtime retry for per-call configuration
from tenacity import Retrying, stop_after_attempt, wait_exponential

def _open(self, source_file: SourceFile, mode, encoding=None, retries: int = 3):
    for attempt in Retrying(
        wait=wait_exponential(multiplier=2, min=3),
        stop=stop_after_attempt(retries),
    ):
        with attempt:
            return self._do_open(source_file, mode, encoding)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/sourcecache.py` around lines 766 - 771, The decorator on _open
references an undefined retries and also uses logging without importing it; fix
by either (A) hardcoding a retry count in the decorator (e.g., replace
stop=stop_after_attempt(retries) with stop=stop_after_attempt(3)) and add
"import logging" to the module, or (B) remove the decorator and implement
runtime retries using tenacity.Retrying/stop_after_attempt with a retries
parameter (e.g., implement a loop in _open that calls self._do_open inside a
Retrying context); locate the tenacity decorator on the _open definition and the
_open function itself to apply the change. Ensure logging.DEBUG is available by
importing logging.

14-14: ⚠️ Potential issue | 🔴 Critical

Missing logging module import.

The after_log callback at line 769 uses logging.DEBUG, but the logging module is never imported. Only logger from snakemake.logging is available.

Proposed fix
+import logging
 from tenacity import retry, stop_after_attempt, wait_exponential, after_log
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/sourcecache.py` at line 14, The code uses after_log with
logging.DEBUG but never imports the logging module; add an import for the
standard logging library at the top of the file so the symbol logging.DEBUG is
defined (e.g., add "import logging" alongside other imports in sourcecache.py),
ensuring the retry(..., after=after_log(logger, logging.DEBUG)) call works
correctly with the logging module available.
🧹 Nitpick comments (1)
src/snakemake/sourcecache.py (1)

766-770: Consider reraise=True to preserve original exception type for callers.

When retries are exhausted, tenacity raises RetryError by default, wrapping the last exception. Callers like report/__init__.py catch generic Exception and wrap in WorkflowError, but the original exception context may be obscured. Adding reraise=True to the decorator will re-raise the original exception instead:

`@retry`(
    wait=...,
    stop=...,
    reraise=True,  # Re-raise original exception after retry exhaustion
)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/sourcecache.py` around lines 766 - 770, The retry decorator on
the function in sourcecache.py currently lacks reraise=True, so when retries
exhaust tenacity wraps the last exception in RetryError; update the `@retry`(...)
decorator (the one using wait_exponential, stop_after_attempt and after_log with
logger) to include reraise=True so the original exception type is re-raised to
callers (preserving exception context for callers like report/__init__.py).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/snakemake/sourcecache.py`:
- Around line 749-752: The _do_cache function declares a retries parameter that
is never used; either remove the unused retries argument from _do_cache or
implement per-call retrying by replacing the decorator-based retry on _open with
a runtime retry (e.g., use tenacity.Retrying or equivalent) and apply it around
the _open call inside _do_cache so you can pass the retries value; update the
signature of _do_cache (and any callers) if you remove the parameter, or wrap
the self._open invocation in a Retrying(...) block that uses the retries
argument to control attempts.

---

Duplicate comments:
In `@src/snakemake/sourcecache.py`:
- Around line 383-389: The code constructs a callable with
partial(self.repo.remotes.origin.fetch, "+refs/heads/*:refs/heads/*") but never
invokes it and then unconditionally sets self._fetched and returns on
GitCommandError, which defeats retry logic; change this so you actually call the
fetch (invoke the partial or call self.repo.remotes.origin.fetch with the
refspec), do not catch git.GitCommandError here so tenacity can retry, and only
set self._fetched = True after a successful fetch; preserve the Optional[str]
contract by propagating or re-raising the final exception to be handled in
fetch_if_required (or return an error string there) rather than returning str(e)
from this method.
- Around line 766-771: The decorator on _open references an undefined retries
and also uses logging without importing it; fix by either (A) hardcoding a retry
count in the decorator (e.g., replace stop=stop_after_attempt(retries) with
stop=stop_after_attempt(3)) and add "import logging" to the module, or (B)
remove the decorator and implement runtime retries using
tenacity.Retrying/stop_after_attempt with a retries parameter (e.g., implement a
loop in _open that calls self._do_open inside a Retrying context); locate the
tenacity decorator on the _open definition and the _open function itself to
apply the change. Ensure logging.DEBUG is available by importing logging.
- Line 14: The code uses after_log with logging.DEBUG but never imports the
logging module; add an import for the standard logging library at the top of the
file so the symbol logging.DEBUG is defined (e.g., add "import logging"
alongside other imports in sourcecache.py), ensuring the retry(...,
after=after_log(logger, logging.DEBUG)) call works correctly with the logging
module available.

---

Nitpick comments:
In `@src/snakemake/sourcecache.py`:
- Around line 766-770: The retry decorator on the function in sourcecache.py
currently lacks reraise=True, so when retries exhaust tenacity wraps the last
exception in RetryError; update the `@retry`(...) decorator (the one using
wait_exponential, stop_after_attempt and after_log with logger) to include
reraise=True so the original exception type is re-raised to callers (preserving
exception context for callers like report/__init__.py).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6ff6f75f-95e7-4f44-8850-315c6683f8b8

📥 Commits

Reviewing files that changed from the base of the PR and between b7b51e4 and b67d390.

📒 Files selected for processing (1)
  • src/snakemake/sourcecache.py

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
src/snakemake/sourcecache.py (2)

372-389: ⚠️ Potential issue | 🔴 Critical

Call origin.fetch() here and let tenacity see failures.

Lines 383-388 currently build a partial without executing it, and return str(e) hides the failure from tenacity. As written, existing clones stay stale and this retry wrapper never retries a failed fetch.

Minimal fix
-        try:
-            # arg is needed in order to have a refspec, associated issue with
-            # workaround is here: https://github.com/gitpython-developers/GitPython/issues/296
-            partial(self.repo.remotes.origin.fetch, "+refs/heads/*:refs/heads/*")
-        except git.GitCommandError as e:
-            return str(e)
+        # arg is needed in order to have a refspec, associated issue with
+        # workaround is here: https://github.com/gitpython-developers/GitPython/issues/296
+        self.repo.remotes.origin.fetch("+refs/heads/*:refs/heads/*")
         self._fetched = True

If fetch_if_required() still needs an Optional[str], convert the final exception one level up after the retry loop finishes.

#!/bin/bash
sed -n '372,389p' src/snakemake/sourcecache.py
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/sourcecache.py` around lines 372 - 389, The fetch method
currently constructs a functools.partial for self.repo.remotes.origin.fetch but
never calls it and catches GitCommandError returning its string, which prevents
tenacity from retrying; replace the partial usage by actually calling
self.repo.remotes.origin.fetch("+refs/heads/*:refs/heads/*") so exceptions
propagate to tenacity, remove the return str(e) in the except block (or re-raise
the exception) and only set self._fetched = True after a successful fetch so
failed fetches are retried by the `@retry` wrapper on fetch.

14-14: ⚠️ Potential issue | 🔴 Critical

Import logging before using logging.DEBUG.

Line 769 now references logging.DEBUG, but this module never imports logging. That makes sourcecache.py fail at import time with NameError.

Minimal fix
+import logging
 from tenacity import retry, stop_after_attempt, wait_exponential, after_log
#!/bin/bash
sed -n '1,25p' src/snakemake/sourcecache.py
sed -n '766,770p' src/snakemake/sourcecache.py
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/sourcecache.py` at line 14, The module uses logging.DEBUG
(referenced in the retry decorator or related code) but never imports the
logging module; add "import logging" alongside the other imports at the top of
src/snakemake/sourcecache.py (near the existing "from tenacity import retry,
stop_after_attempt, wait_exponential, after_log") so that logging.DEBUG is
defined when used (e.g., in the retry/after_log usage).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/snakemake/sourcecache.py`:
- Around line 766-770: The retry decorator on _open() currently uses
stop_after_attempt(1) and doesn't set reraise, which prevents retries and causes
tenacity.RetryError to be raised instead of the original WorkflowError; update
the decorator on _open() to allow multiple attempts (e.g.,
stop=stop_after_attempt(3) or a more appropriate retry count) and add
reraise=True to the retry(...) call so transient failures are retried and the
original exception from _open() is propagated to callers instead of RetryError.

---

Duplicate comments:
In `@src/snakemake/sourcecache.py`:
- Around line 372-389: The fetch method currently constructs a functools.partial
for self.repo.remotes.origin.fetch but never calls it and catches
GitCommandError returning its string, which prevents tenacity from retrying;
replace the partial usage by actually calling
self.repo.remotes.origin.fetch("+refs/heads/*:refs/heads/*") so exceptions
propagate to tenacity, remove the return str(e) in the except block (or re-raise
the exception) and only set self._fetched = True after a successful fetch so
failed fetches are retried by the `@retry` wrapper on fetch.
- Line 14: The module uses logging.DEBUG (referenced in the retry decorator or
related code) but never imports the logging module; add "import logging"
alongside the other imports at the top of src/snakemake/sourcecache.py (near the
existing "from tenacity import retry, stop_after_attempt, wait_exponential,
after_log") so that logging.DEBUG is defined when used (e.g., in the
retry/after_log usage).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 983e29a3-8f59-4ca9-a09b-902816b03a54

📥 Commits

Reviewing files that changed from the base of the PR and between b67d390 and 1d0472c.

📒 Files selected for processing (1)
  • src/snakemake/sourcecache.py

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
src/snakemake/sourcecache.py (2)

13-13: ⚠️ Potential issue | 🔴 Critical

Fix the _open() retry decorator before import.

Line 768 uses logger.INFO, but logger is a logging.Logger and has no INFO attribute, which matches the current pipeline failure. Also, stop_after_attempt(1) disables retries, and without reraise=True exhausted attempts surface as RetryError instead of the WorkflowError raised below.

Suggested fix
+import logging
 from tenacity import retry, stop_after_attempt, wait_exponential, after_log
@@
     `@retry`(
         wait=wait_exponential(multiplier=2, min=3),
-        stop=stop_after_attempt(1),
-        after=after_log(logger, logger.INFO),
+        stop=stop_after_attempt(3),
+        after=after_log(logger, logging.INFO),
+        reraise=True,
     )

Also applies to: 765-769

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/sourcecache.py` at line 13, The retry decorator applied to
_open() is incorrect: replace the non-existent logger.INFO with the proper
logging level constant (e.g., logging.INFO) or import the logging module and use
logging.INFO when calling after_log; change stop_after_attempt(1) to allow
retries (e.g., stop_after_attempt(3) or remove the single-attempt stopper) and
add reraise=True to the retry(...) args so that exhausted retries surface the
original WorkflowError instead of a tenacity.RetryError; update the decorator
import usage (retry, stop_after_attempt, wait_exponential, after_log) on the
_open function accordingly.

371-387: ⚠️ Potential issue | 🟠 Major

Let tenacity see fetch failures.

return str(e) turns a failed origin.fetch() into a successful call, so this decorator never retries and transient remote errors still fall through on the first attempt. If you still need the Optional[str] contract, convert the final failure to a message in fetch_if_required() after the retry loop, not inside fetch().

Minimal fix
     `@retry`(wait=wait_exponential(multiplier=2, min=3), stop=stop_after_attempt(3))
     def fetch(self) -> Optional[str]:
-        import git
-
         if self._fetched:
             # up to date, nothing to do
             return
@@
-        try:
-            # arg is needed in order to have a refspec, associated issue with
-            # workaround is here: https://github.com/gitpython-developers/GitPython/issues/296
-            self.repo.remotes.origin.fetch("+refs/heads/*:refs/heads/*")
-        except git.GitCommandError as e:
-            return str(e)
+        # arg is needed in order to have a refspec, associated issue with
+        # workaround is here: https://github.com/gitpython-developers/GitPython/issues/296
+        self.repo.remotes.origin.fetch("+refs/heads/*:refs/heads/*")
         self._fetched = True
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/sourcecache.py` around lines 371 - 387, The fetch method
currently catches git.GitCommandError and returns str(e), which prevents the
`@retry` decorator on fetch from seeing failures; remove the return str(e) and
re-raise the exception (e.g. raise or raise from e) so tenacity can retry
origin.fetch; keep the Optional[str] surface by moving error-to-string
conversion into the caller fetch_if_required() after fetch finishes (catch
exceptions there and return the error message), and ensure references to
repo.remotes.origin.fetch and the `@retry-decorated` fetch() are updated
accordingly.
🧹 Nitpick comments (1)
src/snakemake/sourcecache.py (1)

784-805: Chain the underlying exception in both _open() handlers.

These two wrappers drop the original traceback right where the new retry path was added. Use raise ... from e in both handlers so the final failure still points to the underlying Git or I/O error.

Suggested fix
         except git.GitCommandError as e:
             raise WorkflowError(
                 f"Failed to get local git source file {log_path}: {e}. "
                 "Is the local git clone up to date?"
-            )
+            ) from e
@@
         except Exception as e:
-            raise WorkflowError(
-                f"Failed to open source file {log_path}",
-                e,
-            )
+            raise WorkflowError(f"Failed to open source file {log_path}") from e

Based on learnings: use exception chaining (raise NewError(...) from original) to preserve the original exception context when re-raising or wrapping exceptions in Python files.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/sourcecache.py` around lines 784 - 805, The WorkflowError
wrappers in the _open() handlers drop original tracebacks; update both exception
raises (the one catching git.GitCommandError as e and the generic except
Exception as e that wraps open(...)) to chain the original exception using
"raise WorkflowError(... ) from e" so the underlying git or I/O exception
context is preserved when re-raising.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/snakemake/sourcecache.py`:
- Line 13: The retry decorator applied to _open() is incorrect: replace the
non-existent logger.INFO with the proper logging level constant (e.g.,
logging.INFO) or import the logging module and use logging.INFO when calling
after_log; change stop_after_attempt(1) to allow retries (e.g.,
stop_after_attempt(3) or remove the single-attempt stopper) and add reraise=True
to the retry(...) args so that exhausted retries surface the original
WorkflowError instead of a tenacity.RetryError; update the decorator import
usage (retry, stop_after_attempt, wait_exponential, after_log) on the _open
function accordingly.
- Around line 371-387: The fetch method currently catches git.GitCommandError
and returns str(e), which prevents the `@retry` decorator on fetch from seeing
failures; remove the return str(e) and re-raise the exception (e.g. raise or
raise from e) so tenacity can retry origin.fetch; keep the Optional[str] surface
by moving error-to-string conversion into the caller fetch_if_required() after
fetch finishes (catch exceptions there and return the error message), and ensure
references to repo.remotes.origin.fetch and the `@retry-decorated` fetch() are
updated accordingly.

---

Nitpick comments:
In `@src/snakemake/sourcecache.py`:
- Around line 784-805: The WorkflowError wrappers in the _open() handlers drop
original tracebacks; update both exception raises (the one catching
git.GitCommandError as e and the generic except Exception as e that wraps
open(...)) to chain the original exception using "raise WorkflowError(... ) from
e" so the underlying git or I/O exception context is preserved when re-raising.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 70679b0b-215b-450b-a69d-30e4b3c45ed6

📥 Commits

Reviewing files that changed from the base of the PR and between 1d0472c and 8d17aea.

📒 Files selected for processing (1)
  • src/snakemake/sourcecache.py

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/snakemake/sourcecache.py (1)

382-388: ⚠️ Potential issue | 🟠 Major

Exception swallowing prevents tenacity from retrying.

The try/except block catches git.GitCommandError and returns str(e) instead of raising. Tenacity only retries on raised exceptions, so transient fetch failures will never be retried—the method "succeeds" from tenacity's perspective by returning a string.

Either re-raise the exception to let tenacity retry, or move the error handling outside the decorated method.

Option 1: Let tenacity retry by removing inner exception handling
     `@retry`(wait=wait_exponential(multiplier=2, min=3), stop=stop_after_attempt(3))
-    def fetch(self) -> Optional[str]:
+    def fetch(self) -> None:
         import git

         if self._fetched:
             # up to date, nothing to do
             return

         logger.info(
             f"Fetching latest changes of {self.host}/{self.repo_name} to {self.repo_clone}"
         )
-        try:
-            # arg is needed in order to have a refspec, associated issue with
-            # workaround is here: https://github.com/gitpython-developers/GitPython/issues/296
-            self.repo.remotes.origin.fetch("+refs/heads/*:refs/heads/*")
-        except git.GitCommandError as e:
-            return str(e)
+        # arg is needed in order to have a refspec, associated issue with
+        # workaround is here: https://github.com/gitpython-developers/GitPython/issues/296
+        self.repo.remotes.origin.fetch("+refs/heads/*:refs/heads/*")
         self._fetched = True

If callers need the Optional[str] contract for error reporting, wrap the call site with error handling instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/sourcecache.py` around lines 382 - 388, The fetch block
currently swallows git.GitCommandError by returning str(e), preventing tenacity
retries; change the handler in the method that calls
self.repo.remotes.origin.fetch (the method containing that try/except) to not
return the exception string—either remove the try/except so the
git.GitCommandError is propagated to tenacity, or re-raise the caught exception
(raise) instead of returning it; if you must preserve an Optional[str] return
for callers, move the try/except to the call site and convert the exception to a
string there so the tenacity-decorated function only raises on failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/snakemake/sourcecache.py`:
- Around line 382-388: The fetch block currently swallows git.GitCommandError by
returning str(e), preventing tenacity retries; change the handler in the method
that calls self.repo.remotes.origin.fetch (the method containing that
try/except) to not return the exception string—either remove the try/except so
the git.GitCommandError is propagated to tenacity, or re-raise the caught
exception (raise) instead of returning it; if you must preserve an Optional[str]
return for callers, move the try/except to the call site and convert the
exception to a string there so the tenacity-decorated function only raises on
failures.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c50d2054-05e2-4617-ac78-08ed328637a6

📥 Commits

Reviewing files that changed from the base of the PR and between 8d17aea and 6ada303.

📒 Files selected for processing (1)
  • src/snakemake/sourcecache.py

@johanneskoester johanneskoester changed the title perf: switch reretry with tenacity perf: switch reretry with tenacity; decouple container classes (with Python 3.7 compat for old scripts) from rest of the codebase (enabling moving to newer python versions) Mar 11, 2026
@johanneskoester johanneskoester enabled auto-merge (squash) March 11, 2026 15:44
@johanneskoester johanneskoester merged commit ffb19e7 into snakemake:main Mar 11, 2026
56 of 59 checks passed
@fgvieira fgvieira deleted the retry branch March 11, 2026 16:51
for x in toclone
)
elif strip_constraints:
self.extend(map(strip_wildcard_constraints, toclone))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strip_wildcard_constraints havn't been imported.

@property
def size_tempfiles(self):
async def is_temp(iofile):
return is_flagged(iofile, "temp")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_flagged is't imported yet


if self.is_callable():
assert callable(self._file)
f = self._file(Namedlist(fromdict=wildcards))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Namedlist is moved to other files. It is not ye imported.

Copy link
Copy Markdown
Contributor

@Hocnonsense Hocnonsense left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fgvieira @johanneskoester
After moving Namedlist to a separate file, it seems that some functions are used before being imported. These areas also don’t appear to be covered by tests. Would it make sense to remove them?

@fgvieira fgvieira self-assigned this Mar 13, 2026
@fgvieira fgvieira added the enhancement New feature or request label Mar 13, 2026
johanneskoester pushed a commit that referenced this pull request Mar 13, 2026
🤖 I have created a release *beep* *boop*
---


##
[9.17.0](v9.16.3...v9.17.0)
(2026-03-13)


### Features

* Allow storing snakemake metadata in files or databases
([#4012](#4012))
([dd75f31](dd75f31))
* Allow to specify comparison command per-unit test
([#3956](#3956))
([b88171c](b88171c))
* job table orderd topological when run is started
([#4018](#4018))
([75cf506](75cf506))
* lambda functions for priority in rules
([#3253](#3253))
([d2aa226](d2aa226))
* Make on... directive of modules accessible
([#4050](#4050))
([e9f2e1c](e9f2e1c))


### Bug Fixes

* adjust conda tests to not fail on apple silicon; fix
[#4040](#4040)
([#4049](#4049))
([f5b0142](f5b0142))
* allow "--containerize apptainer" to output apptainer format instead of
dockerfile ([#4030](#4030))
([f5cac30](f5cac30))
* apptainer command not recognized when singularity is absent
([#4010](#4010))
([b8162e2](b8162e2))
* capture stderr when tests fail
([#3995](#3995))
([97d74ba](97d74ba))
* **docs:** make Data-dependent conditional execution a complete example
([#4043](#4043))
([3a1d7f2](3a1d7f2))
* don't build the DAG when running unlock. Fixes
[#4000](#4000) and
[#198](#198)
([#4007](#4007))
([acf79fd](acf79fd))
* Ensure pixi tasks may be run as advertised
([#4046](#4046))
([88253c2](88253c2))
* fix checkpoint handling corner cases
([#3870](#3870) and
[#3559](#3559))
([#4015](#4015))
([63f4257](63f4257))
* issue 3642
([#4054](#4054))
([76e6fc2](76e6fc2))
* issue 3815
([#4026](#4026))
([b0eec96](b0eec96))
* logging None in shellcmd context causes error
([#4064](#4064))
([d0652cd](d0652cd))
* lookup function returns default value for empty DataFrame queries
([#4056](#4056))
([f71de97](f71de97))
* make `cache: omit-software` a rule specific property
([#4085](#4085))
([034a9e7](034a9e7))
* reduce number of tests leaving temporary files behind
([#4033](#4033))
([a3a1c97](a3a1c97))
* regression in dynamic resource handling
([#4038](#4038))
([f2c554a](f2c554a))
* somewhat shorter announce message
([#4080](#4080))
([57efc71](57efc71))


### Performance Improvements

* switch reretry with tenacity; decouple container classes (with Python
3.7 compat for old scripts) from rest of the codebase (enabling moving
to newer python versions)
([#4032](#4032))
([ffb19e7](ffb19e7))


### Documentation

* Add AI-assisted contributions policy to contributing guidelines
([#4051](#4051))
([dd70526](dd70526))
* **codebase:** Update & simplify plugin architecture section
([#4052](#4052))
([176cf63](176cf63))
* Correct workflow.source_path() description in documentation
([#4036](#4036))
([45883c5](45883c5))
* fixed wrong code example for collect() function
([#4037](#4037))
([5c85ed8](5c85ed8))
* Minor docs improvements
([#4089](#4089))
([29ea226](29ea226))
* switch to sphinx_design for tabs
([#3976](#3976))
([9674614](9674614))
* typo in the migration table breaking a pip install command
([#4024](#4024))
([66f9dda](66f9dda))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
johanneskoester added a commit that referenced this pull request Mar 13, 2026
…nd snakemake.io.container (#4082)

fix bugs introduced in #4032 

### QC
<!-- Make sure that you can tick the boxes below. -->

* [ ] The PR contains a test case for the changes or the changes are
already covered by an existing test case.
* [x] The change does neither modify the language nor the behavior or
functionalities of Snakemake.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Improved detection and handling of function-based file paths and
wildcard resolution in IO workflows.

* **New Features**
* Added a property to report temporary file sizes for input collections.

* **Chores**
* Strengthened internal type hints and forward-reference typing to
improve reliability and maintainability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Johannes Köster <johannes.koester@tu-dortmund.de>
fgvieira added a commit to snakemake/snakemake-wrappers that referenced this pull request Mar 26, 2026
<!-- Ensure that the PR title follows conventional commit style (<type>:
<description>)-->
<!-- Possible types are here:
https://github.com/commitizen/conventional-commit-types/blob/master/index.json
-->

Fixes wrapper after changes in snakemake/snakemake#4032

<!-- Add a description of your PR here-->

### QC
<!-- Make sure that you can tick the boxes below. -->

* [x] I confirm that I have followed the [documentation for contributing
to
`snakemake-wrappers`](https://snakemake-wrappers.readthedocs.io/en/stable/contributing.html).

While the contributions guidelines are more extensive, please
particularly ensure that:
* [ ] `test.py` was updated to call any added or updated example rules
in a `Snakefile`
* [ ] `input:` and `output:` file paths in the rules can be chosen
arbitrarily
* [ ] wherever possible, command line arguments are inferred and set
automatically (e.g. based on file extensions in `input:` or `output:`)
* [ ] temporary files are either written to a unique hidden folder in
the working directory, or (better) stored where the Python function
`tempfile.gettempdir()` points to
* [ ] the `meta.yaml` contains a link to the documentation of the
respective tool or command under `url:`
* [ ] conda environments use a minimal amount of channels and packages,
in recommended ordering


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Refactor**
* Switched to the workflow engine’s managed log redirection and removed
manual logfile clearing and custom stdout/stderr handling.
* **Bug Fixes**
* Reworked command construction and threading logic so single-threaded
tools are capped and inputs/params/outputs are parsed more reliably for
consistent execution and logging.
* **Chore**
  * Updated the documentation link and corrected a minor wording issue.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: David Laehnemann <1379875+dlaehnemann@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Switch dependency reretry

3 participants