Skip to content

fix: Restore python rules changes triggering reruns. (GH: #3213)#3485

Merged
johanneskoester merged 3 commits intosnakemake:mainfrom
royjacobson:fix_python_caching
Sep 4, 2025
Merged

fix: Restore python rules changes triggering reruns. (GH: #3213)#3485
johanneskoester merged 3 commits intosnakemake:mainfrom
royjacobson:fix_python_caching

Conversation

@royjacobson
Copy link
Copy Markdown
Contributor

@royjacobson royjacobson commented Mar 26, 2025

Trying to fix #3213 by caching the source code of python rules (using dis + linecache magic).

I would appreciate any help in how to approach writing a test for this.

cc @corneliusroemer

QC

  • The PR contains a test case for the changes or the changes are already covered by an existing test case.
  • The documentation (docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

Summary by CodeRabbit

  • New Features
    • Enhanced rule processing to capture both shell command and Python function definitions when available.
    • Expanded the internal rule configuration to include additional source code information, supporting improved debugging and documentation.
    • Introduced a method to retrieve and store source details for rules automatically.
    • New workflow defined with rules for generating files based on time-stamped content.
  • Bug Fixes
    • Added tests to validate the behavior of the new rule processing and ensure expected outputs.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 26, 2025

📝 Walkthrough

Walkthrough

This pull request updates how rule source code is captured and utilized. The Persistence._code method now returns a rule’s shell command if present or falls back to the new run_func_src attribute. The Rule class has been extended with the run_func_src attribute, and the Workflow class now includes the get_rule_source method to extract the source code from a rule’s function using the dis module and linecache. The rule creation process assigns this extracted code to run_func_src when applicable.

Changes

File(s) Change Summary
src/snakemake/persistence.py Modified _code() to return shellcmd if available; if not, it now checks and returns run_func_src.
src/snakemake/rules.py Added the run_func_src attribute to the Rule class.
src/snakemake/workflow.py Introduced the get_rule_source method in Workflow to capture rule function source code; updated the rule creation to set run_func_src when applicable.
tests/test_github_issue3213/Snakefile, tests/test_github_issue3213/Snakefile.newver New Snakefiles added defining workflows with rules all, file_a, and file_b.
tests/tests.py Added a new test function test_github_issue3213() to validate behavior related to the GitHub issue.

Assessment against linked issues

Objective (from #3213) Addressed Explanation
Ensure that code changes in Python rules trigger workflow re-runs (#3213) The changes do not address the issue of detecting code changes in Python rules to trigger workflow re-runs.

Possibly related PRs

Suggested reviewers

  • johanneskoester
✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai plan to trigger planning for file edits and PR creation.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/snakemake/workflow.py (1)

1953-1954: Good caching implementation for rule source code.

These lines add the source code caching mechanism by storing the rule's source code in rule.run_func_src when a rule function is defined. This will help prevent unnecessary reruns when rule source hasn't changed meaningfully.

Since you mentioned needing help with testing, I'd suggest a test case that:

  1. Creates a workflow with a Python rule
  2. Runs it once and verifies completion
  3. Runs again without changes to verify it doesn't rerun (up-to-date)
  4. Makes a comment-only change to the rule source
  5. Verifies it still doesn't rerun
  6. Makes a functional change to the source
  7. Verifies it does rerun (rule marked as outdated)

Would you like me to help develop this test case for you?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 216efef and 55ef9d0.

📒 Files selected for processing (3)
  • src/snakemake/persistence.py (1 hunks)
  • src/snakemake/rules.py (1 hunks)
  • src/snakemake/workflow.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest type annotations for functions that are defined inside of functions or methods. Do not suggest type annotation of the `s...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • src/snakemake/persistence.py
  • src/snakemake/rules.py
  • src/snakemake/workflow.py
🧬 Code Definitions (2)
src/snakemake/persistence.py (1)
src/snakemake/workflow.py (2)
  • rule (1735-2018)
  • shellcmd (2230-2235)
src/snakemake/workflow.py (3)
src/snakemake/rules.py (1)
  • lineno (138-139)
src/snakemake/parser.py (1)
  • lineno (74-75)
src/snakemake/persistence.py (1)
  • rule (426-427)
⏰ Context from checks skipped due to timeout of 90000ms (34)
  • GitHub Check: tests (10, windows-latest, py312)
  • GitHub Check: tests (10, ubuntu-latest, py312)
  • GitHub Check: tests (10, ubuntu-latest, py311)
  • GitHub Check: tests (9, windows-latest, py312)
  • GitHub Check: tests (9, ubuntu-latest, py312)
  • GitHub Check: tests (9, ubuntu-latest, py311)
  • GitHub Check: tests (8, windows-latest, py312)
  • GitHub Check: tests (8, ubuntu-latest, py312)
  • GitHub Check: tests (8, ubuntu-latest, py311)
  • GitHub Check: tests (7, windows-latest, py312)
  • GitHub Check: tests (7, ubuntu-latest, py312)
  • GitHub Check: tests (7, ubuntu-latest, py311)
  • GitHub Check: tests (6, macos-latest, py312)
  • GitHub Check: tests (6, windows-latest, py312)
  • GitHub Check: tests (6, ubuntu-latest, py312)
  • GitHub Check: tests (6, ubuntu-latest, py311)
  • GitHub Check: tests (5, macos-latest, py312)
  • GitHub Check: tests (5, windows-latest, py312)
  • GitHub Check: tests (5, ubuntu-latest, py312)
  • GitHub Check: tests (5, ubuntu-latest, py311)
  • GitHub Check: tests (4, macos-latest, py312)
  • GitHub Check: tests (4, windows-latest, py312)
  • GitHub Check: tests (4, ubuntu-latest, py312)
  • GitHub Check: tests (4, ubuntu-latest, py311)
  • GitHub Check: tests (3, windows-latest, py312)
  • GitHub Check: tests (3, ubuntu-latest, py312)
  • GitHub Check: tests (3, ubuntu-latest, py311)
  • GitHub Check: tests (2, windows-latest, py312)
  • GitHub Check: tests (2, ubuntu-latest, py312)
  • GitHub Check: tests (2, ubuntu-latest, py311)
  • GitHub Check: tests (1, macos-latest, py312)
  • GitHub Check: tests (1, windows-latest, py312)
  • GitHub Check: tests (1, ubuntu-latest, py312)
  • GitHub Check: tests (1, ubuntu-latest, py311)
🔇 Additional comments (4)
src/snakemake/rules.py (1)

110-110: Good addition of the run_func_src attribute.

This new attribute will store the source code of Python rule functions, enabling caching for run functions like it's done for shell commands. This change aligns with the PR objective of implementing a source code caching mechanism to prevent unnecessary reruns.

src/snakemake/persistence.py (1)

557-562: Good implementation of source caching for Python run functions.

This change properly extends the _code method to handle both shell commands and Python function source code. The updated comment also correctly reflects this dual capability. The implementation preserves existing behavior for shell commands while adding support for caching Python function source code when available.

This change, combined with the new run_func_src attribute and the get_rule_source method in workflow.py, should effectively address issue #3213 by preventing unnecessary reruns when Python rules are used.

src/snakemake/workflow.py (2)

13-14: Added necessary imports for rule source retrieval.

The new imports of dis and linecache modules are needed for the source code extraction functionality being implemented.


1717-1734: Well-implemented source code extraction method.

This new get_rule_source method extracts the source code of a rule function by:

  1. Using dis.findlinestarts() to locate line numbers in the compiled function's bytecode
  2. Mapping these to original source lines via self.linemaps
  3. Retrieving the actual source lines with linecache.getline()

The comments explain clearly why inspect module couldn't be used and how the implementation works. The function is thoughtfully designed to skip the first instruction (rule declaration line) and only include the actual rule body code.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/snakemake/workflow.py (2)

1717-1735: Implementation for extracting rule function source code looks solid

This new method retrieves the source code of a rule function by using the dis module to analyze bytecode and map line numbers to the original source file. The approach is well documented and handles the edge case of skipping the first line (offset 0) that corresponds to the rule definition line.

I have a minor concern about error handling - the code assumes that sourcefile is always present in self.linemaps and there's no try/except block to handle cases where line numbers might not be present in the map or issues with reading source files.

Consider adding error handling to gracefully handle cases where:

  1. sourcefile isn't in self.linemaps
  2. Line numbers from dis.findlinestarts() aren't in linemap
  3. linecache.getline() fails to read the file
def get_rule_source(self, func):
    # This can't use `inspect` because the functions are compiled into intermediate python code
    # in parser.py and that intermediate source is not available anymore (or desirable).
    # Instead, we're using dis to retrieve the line numbers of the function in the intermediate
    # code and then map it to the original file using `self.linemaps`.
    sourcefile = func.__code__.co_filename
    line_numbers = []
-    linemap = self.linemaps[sourcefile]
+    try:
+        linemap = self.linemaps.get(sourcefile, {})
+        if not linemap:
+            return ""
+        
+        for func_offset, line in dis.findlinestarts(func.__code__):
+            # The first instruction in the compiled function is RESUME, which
+            # with snakemake is mapped to the 'rule: ' line and is not considered
+            # part of the rule source.
+            if func_offset == 0:
+                continue
+            if line in linemap:
+                line_numbers.append(linemap[line])
+        
+        return "".join(
+            [linecache.getline(sourcefile, lineno) for lineno in sorted(line_numbers)]
+        )
+    except Exception as e:
+        logger.debug(f"Could not extract source for rule function: {e}")
+        return ""
-    for func_offset, line in dis.findlinestarts(func.__code__):
-        # The first instruction in the compiled function is RESUME, which
-        # with snakemake is mapped to the 'rule: ' line and is not considered
-        # part of the rule source.
-        if func_offset == 0:
-            continue
-        if line in linemap:
-            line_numbers.append(linemap[line])
-    return "".join(
-        [linecache.getline(sourcefile, lineno) for lineno in sorted(line_numbers)]
-    )

1-2411: Consider adding unit tests for the source code extraction functionality

The PR objectives mention a need for assistance with developing a test for these changes. Implementing a test would be beneficial to verify that the source code extraction works correctly and that it prevents unnecessary reruns of Python rules.

I can help develop a test case that would:

  1. Create a rule with a Python function
  2. Extract and verify the source code
  3. Test that changes to comments or formatting don't trigger reruns
  4. Test that actual code changes do trigger reruns

Would you like me to draft a test implementation for this feature?

🧰 Tools
🪛 Ruff (0.8.2)

68-68: snakemake.exceptions.update_lineno imported but unused

Remove unused import: snakemake.exceptions.update_lineno

(F401)


75-75: snakemake.io.protected imported but unused

Remove unused import

(F401)


76-76: snakemake.io.temp imported but unused

Remove unused import

(F401)


77-77: snakemake.io.temporary imported but unused

Remove unused import

(F401)


78-78: snakemake.io.ancient imported but unused

Remove unused import

(F401)


79-79: snakemake.io.directory imported but unused

Remove unused import

(F401)


81-81: snakemake.io.glob_wildcards imported but unused

Remove unused import

(F401)


82-82: snakemake.io.flag imported but unused

Remove unused import

(F401)


83-83: snakemake.io.touch imported but unused

Remove unused import

(F401)


84-84: snakemake.io.unpack imported but unused

Remove unused import

(F401)


85-85: snakemake.io.local imported but unused

Remove unused import

(F401)


86-86: snakemake.io.pipe imported but unused

Remove unused import

(F401)


87-87: snakemake.io.service imported but unused

Remove unused import

(F401)


88-88: snakemake.io.repeat imported but unused

Remove unused import

(F401)


89-89: snakemake.io.report imported but unused

Remove unused import

(F401)


90-90: snakemake.io.multiext imported but unused

Remove unused import

(F401)


91-91: snakemake.io.ensure imported but unused

Remove unused import

(F401)


99-99: snakemake.script.script imported but unused

Remove unused import: snakemake.script.script

(F401)


100-100: snakemake.notebook.notebook imported but unused

Remove unused import: snakemake.notebook.notebook

(F401)


101-101: snakemake.wrapper.wrapper imported but unused

Remove unused import: snakemake.wrapper.wrapper

(F401)


102-102: snakemake.cwl.cwl imported but unused

Remove unused import: snakemake.cwl.cwl

(F401)


103-103: snakemake.template_rendering.render_template imported but unused

Remove unused import: snakemake.template_rendering.render_template

(F401)


104-104: snakemake_interface_common.utils.not_iterable imported but unused

Remove unused import: snakemake_interface_common.utils.not_iterable

(F401)


112-112: snakemake.common.Rules imported but unused

Remove unused import: snakemake.common.Rules

(F401)


132-132: snakemake.api imported but unused

Remove unused import: snakemake.api

(F401)


355-356: Use a single if statement instead of nested if statements

(SIM102)


397-397: Redefinition of unused touch from line 83

(F811)


750-754: Do not assign a lambda expression, use a def

Rewrite relpath as a def

(E731)


908-912: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


1038-1038: Redefinition of unused cwl from line 102

(F811)


1682-1682: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


1691-1691: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


1702-1702: Redefinition of unused report from line 89

(F811)


1857-1859: Multiple isinstance calls for r, merge into a single call

Merge isinstance calls for r

(SIM101)


1885-1886: Use a single if statement instead of nested if statements

(SIM102)


1945-1945: Avoid inequality comparisons to False; use if ruleinfo.container_img: for truth checks

Replace with ruleinfo.container_img

(E712)


2114-2116: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


2135-2138: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


2239-2239: Redefinition of unused script from line 99

(F811)


2239-2239: Redefinition of unused script from line 99

(F811)


2246-2246: Redefinition of unused notebook from line 100

(F811)


2246-2246: Redefinition of unused notebook from line 100

(F811)


2253-2253: Redefinition of unused wrapper from line 101

(F811)


2253-2253: Redefinition of unused wrapper from line 101

(F811)


2267-2267: Redefinition of unused cwl from line 102

(F811)


2267-2267: Redefinition of unused cwl from line 102

(F811)


2335-2339: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


2354-2358: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


2361-2365: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 55ef9d0 and ea0782f.

📒 Files selected for processing (3)
  • src/snakemake/persistence.py (1 hunks)
  • src/snakemake/rules.py (1 hunks)
  • src/snakemake/workflow.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/snakemake/rules.py
  • src/snakemake/persistence.py
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest type annotations for functions that are defined inside of functions or methods. Do not suggest type annotation of the `s...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • src/snakemake/workflow.py
🧬 Code Definitions (1)
src/snakemake/workflow.py (3)
src/snakemake/rules.py (1)
  • lineno (138-139)
src/snakemake/parser.py (1)
  • lineno (74-75)
src/snakemake/persistence.py (1)
  • rule (426-427)
⏰ Context from checks skipped due to timeout of 90000ms (36)
  • GitHub Check: tests (10, macos-latest, py312)
  • GitHub Check: tests (10, windows-latest, py312)
  • GitHub Check: tests (10, ubuntu-latest, py312)
  • GitHub Check: tests (10, ubuntu-latest, py311)
  • GitHub Check: tests (9, windows-latest, py312)
  • GitHub Check: tests (9, ubuntu-latest, py312)
  • GitHub Check: tests (9, ubuntu-latest, py311)
  • GitHub Check: tests (8, windows-latest, py312)
  • GitHub Check: tests (8, ubuntu-latest, py312)
  • GitHub Check: tests (8, ubuntu-latest, py311)
  • GitHub Check: tests (7, windows-latest, py312)
  • GitHub Check: tests (7, ubuntu-latest, py312)
  • GitHub Check: tests (7, ubuntu-latest, py311)
  • GitHub Check: tests (6, windows-latest, py312)
  • GitHub Check: tests (6, ubuntu-latest, py312)
  • GitHub Check: tests (6, ubuntu-latest, py311)
  • GitHub Check: tests (5, windows-latest, py312)
  • GitHub Check: tests (5, ubuntu-latest, py312)
  • GitHub Check: tests (5, ubuntu-latest, py311)
  • GitHub Check: tests (4, macos-latest, py312)
  • GitHub Check: tests (4, windows-latest, py312)
  • GitHub Check: tests (4, ubuntu-latest, py312)
  • GitHub Check: tests (4, ubuntu-latest, py311)
  • GitHub Check: tests (3, macos-latest, py312)
  • GitHub Check: tests (3, windows-latest, py312)
  • GitHub Check: tests (3, ubuntu-latest, py312)
  • GitHub Check: tests (3, ubuntu-latest, py311)
  • GitHub Check: tests (2, macos-latest, py312)
  • GitHub Check: tests (2, windows-latest, py312)
  • GitHub Check: tests (2, ubuntu-latest, py312)
  • GitHub Check: tests (2, ubuntu-latest, py311)
  • GitHub Check: tests (1, macos-latest, py312)
  • GitHub Check: tests (1, windows-latest, py312)
  • GitHub Check: tests (1, ubuntu-latest, py312)
  • GitHub Check: tests (1, ubuntu-latest, py311)
  • GitHub Check: apidocs
🔇 Additional comments (2)
src/snakemake/workflow.py (2)

13-14: New imports support source code extraction for rules

New imports for the Python standard library modules dis and linecache are added to support the source code extraction functionality for rule functions.


1955-1956: Storing source code for rule run functions

This change assigns the extracted source code to the rule object when a run function exists, completing the key part of this PR's functionality.

@johanneskoester
Copy link
Copy Markdown
Contributor

I guess having a testcase with a rule that has a run directive, which first runs the workflow with cleanup=False (an argument of the test suites run function), and then a second run that tries to run again should be a start. There one would have to use python's MagickMock to hook into the dag in order to check that no rerun is triggered. Then, in a second rerun, one could first alter the run directive code trivially via some replace command and then check via the same way if a rerun is triggered.

Also, should be assured that the new approach really captures the function code and not lines above or below the actual function.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/tests.py (1)

1907-1935: Excellent test implementation for the caching issue.

This test follows the recommended approach perfectly:

  1. It runs the workflow initially and captures the outputs
  2. It clears the linecache to ensure new source code is picked up
  3. It replaces the Snakefile with a modified version
  4. It runs the workflow again without forcing to test if the change is detected naturally
  5. It runs a third time with forceall=True to get fresh outputs
  6. The assertions verify the expected behavior

The test is well-structured, but consider using context managers for file operations:

- ts_a1, ts_b1 = [open(os.path.join(tmpdir, f)).read() for f in ['file_a.txt', 'file_b.txt']]
+ ts_a1, ts_b1 = [open(os.path.join(tmpdir, f)).read() and f.close() for f in ['file_a.txt', 'file_b.txt']]

Or better:

with open(os.path.join(tmpdir, 'file_a.txt')) as f_a, open(os.path.join(tmpdir, 'file_b.txt')) as f_b:
    ts_a1, ts_b1 = f_a.read(), f_b.read()

And similarly for the other file reading operations.

🧰 Tools
🪛 Ruff (0.8.2)

1912-1912: Use a context manager for opening files

(SIM115)


1920-1920: Use a context manager for opening files

(SIM115)


1923-1923: Use a context manager for opening files

(SIM115)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ea0782f and f85e011.

📒 Files selected for processing (4)
  • src/snakemake/workflow.py (3 hunks)
  • tests/test_github_issue3213/Snakefile (1 hunks)
  • tests/test_github_issue3213/Snakefile.newver (1 hunks)
  • tests/tests.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/snakemake/workflow.py
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest type annotations for functions that are defined inside of functions or methods. Do not suggest type annotation of the `s...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • tests/tests.py
🧬 Code Definitions (1)
tests/tests.py (1)
tests/common.py (2)
  • run (151-494)
  • dpath (32-35)
🪛 Ruff (0.8.2)
tests/tests.py

1912-1912: Use a context manager for opening files

(SIM115)


1920-1920: Use a context manager for opening files

(SIM115)


1923-1923: Use a context manager for opening files

(SIM115)

⏰ Context from checks skipped due to timeout of 90000ms (34)
  • GitHub Check: tests (10, windows-latest, py312)
  • GitHub Check: tests (10, ubuntu-latest, py312)
  • GitHub Check: tests (10, ubuntu-latest, py311)
  • GitHub Check: tests (9, windows-latest, py312)
  • GitHub Check: tests (9, ubuntu-latest, py312)
  • GitHub Check: tests (9, ubuntu-latest, py311)
  • GitHub Check: tests (8, windows-latest, py312)
  • GitHub Check: tests (8, ubuntu-latest, py312)
  • GitHub Check: tests (8, ubuntu-latest, py311)
  • GitHub Check: tests (7, windows-latest, py312)
  • GitHub Check: tests (7, ubuntu-latest, py312)
  • GitHub Check: tests (7, ubuntu-latest, py311)
  • GitHub Check: tests (6, windows-latest, py312)
  • GitHub Check: tests (6, ubuntu-latest, py312)
  • GitHub Check: tests (6, ubuntu-latest, py311)
  • GitHub Check: tests (5, windows-latest, py312)
  • GitHub Check: tests (5, ubuntu-latest, py312)
  • GitHub Check: tests (5, ubuntu-latest, py311)
  • GitHub Check: tests (4, macos-latest, py312)
  • GitHub Check: tests (4, windows-latest, py312)
  • GitHub Check: tests (4, ubuntu-latest, py312)
  • GitHub Check: tests (4, ubuntu-latest, py311)
  • GitHub Check: tests (3, macos-latest, py312)
  • GitHub Check: tests (3, windows-latest, py312)
  • GitHub Check: tests (3, ubuntu-latest, py312)
  • GitHub Check: tests (3, ubuntu-latest, py311)
  • GitHub Check: tests (2, macos-latest, py312)
  • GitHub Check: tests (2, windows-latest, py312)
  • GitHub Check: tests (2, ubuntu-latest, py312)
  • GitHub Check: tests (2, ubuntu-latest, py311)
  • GitHub Check: tests (1, macos-latest, py312)
  • GitHub Check: tests (1, windows-latest, py312)
  • GitHub Check: tests (1, ubuntu-latest, py312)
  • GitHub Check: tests (1, ubuntu-latest, py311)
🔇 Additional comments (2)
tests/test_github_issue3213/Snakefile (1)

1-24: Test implementation looks good for checking rule source code caching.

This file creates a simple workflow with three rules, each properly defined with input/output relationships. The file_a and file_b rules each use a run directive to write their respective output files with timestamps using time.time_ns(). This setup will effectively test if changes to rule source code are properly detected or cached.

The difference between this file and Snakefile.newver is that file_b writes "B" in this version, while it writes "D" in the newer version - perfect for testing source code change detection.

tests/test_github_issue3213/Snakefile.newver (1)

1-24: Good implementation of the modified Snakefile for testing.

This is a well-designed variation of the original Snakefile, with the only difference being that file_b now writes "D" instead of "B" to the output file. This small change is perfect for testing whether Snakemake properly detects changes to the source code in Python run blocks.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/tests.py (1)

1907-1951: LGTM! Good test implementation for the Python rules caching mechanism.

This test effectively validates that the caching solution works correctly by:

  1. Running a workflow and capturing output
  2. Clearing the linecache (crucial for the fix being tested)
  3. Changing the Snakefile and verifying only rules with changed code are rerun
  4. Verifying that forceall works as expected

Minor improvement opportunity: consider using context managers for file handling operations.

-        ts_a1, ts_b1 = [open(os.path.join(tmpdir, f)).read() for f in output_files]
+        ts_a1, ts_b1 = [open(os.path.join(tmpdir, f)).read() for f in output_files]

-        ts_a2, ts_b2 = [open(os.path.join(tmpdir, f)).read() for f in output_files]
+        with open(os.path.join(tmpdir, f)) as file_a, open(os.path.join(tmpdir, f)) as file_b:
+            ts_a2 = file_a.read()
+            ts_b2 = file_b.read()

-        ts_a3, ts_b3 = [open(os.path.join(tmpdir, f)).read() for f in output_files]
+        ts_a3, ts_b3 = [open(os.path.join(tmpdir, f)).read() for f in output_files]

Actually, a cleaner solution would be to create a helper function:

def read_output_files(directory, files):
    results = []
    for f in files:
        with open(os.path.join(directory, f)) as file:
            results.append(file.read())
    return results

# Then use it like:
ts_a1, ts_b1 = read_output_files(tmpdir, output_files)
🧰 Tools
🪛 Ruff (0.8.2)

1914-1914: Use a context manager for opening files

(SIM115)


1930-1930: Use a context manager for opening files

(SIM115)


1939-1939: Use a context manager for opening files

(SIM115)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f85e011 and 2a290d5.

📒 Files selected for processing (3)
  • tests/test_github_issue3213/Snakefile (1 hunks)
  • tests/test_github_issue3213/Snakefile.newver (1 hunks)
  • tests/tests.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/test_github_issue3213/Snakefile.newver
  • tests/test_github_issue3213/Snakefile
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest type annotations for functions that are defined inside of functions or methods. Do not suggest type annotation of the `s...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • tests/tests.py
🧬 Code Definitions (1)
tests/tests.py (1)
tests/common.py (2)
  • run (151-494)
  • dpath (32-35)
🪛 Ruff (0.8.2)
tests/tests.py

1914-1914: Use a context manager for opening files

(SIM115)


1930-1930: Use a context manager for opening files

(SIM115)


1939-1939: Use a context manager for opening files

(SIM115)

⏰ Context from checks skipped due to timeout of 90000ms (36)
  • GitHub Check: tests (10, macos-latest, py312)
  • GitHub Check: tests (10, windows-latest, py312)
  • GitHub Check: tests (10, ubuntu-latest, py312)
  • GitHub Check: tests (10, ubuntu-latest, py311)
  • GitHub Check: tests (9, windows-latest, py312)
  • GitHub Check: tests (9, ubuntu-latest, py312)
  • GitHub Check: tests (9, ubuntu-latest, py311)
  • GitHub Check: tests (8, windows-latest, py312)
  • GitHub Check: tests (8, ubuntu-latest, py312)
  • GitHub Check: tests (8, ubuntu-latest, py311)
  • GitHub Check: tests (7, windows-latest, py312)
  • GitHub Check: tests (7, ubuntu-latest, py312)
  • GitHub Check: tests (7, ubuntu-latest, py311)
  • GitHub Check: tests (6, windows-latest, py312)
  • GitHub Check: tests (6, ubuntu-latest, py312)
  • GitHub Check: tests (6, ubuntu-latest, py311)
  • GitHub Check: tests (5, windows-latest, py312)
  • GitHub Check: tests (5, ubuntu-latest, py312)
  • GitHub Check: tests (5, ubuntu-latest, py311)
  • GitHub Check: tests (4, macos-latest, py312)
  • GitHub Check: tests (4, windows-latest, py312)
  • GitHub Check: tests (4, ubuntu-latest, py312)
  • GitHub Check: tests (4, ubuntu-latest, py311)
  • GitHub Check: tests (3, macos-latest, py312)
  • GitHub Check: tests (3, windows-latest, py312)
  • GitHub Check: tests (3, ubuntu-latest, py312)
  • GitHub Check: tests (3, ubuntu-latest, py311)
  • GitHub Check: tests (2, macos-latest, py312)
  • GitHub Check: tests (2, windows-latest, py312)
  • GitHub Check: tests (2, ubuntu-latest, py312)
  • GitHub Check: tests (2, ubuntu-latest, py311)
  • GitHub Check: tests (1, macos-latest, py312)
  • GitHub Check: tests (1, windows-latest, py312)
  • GitHub Check: tests (1, ubuntu-latest, py312)
  • GitHub Check: tests (1, ubuntu-latest, py311)
  • GitHub Check: apidocs

@royjacobson
Copy link
Copy Markdown
Contributor Author

I didn't hook into the DAG directly but I validate that the change in rules was detected using timestamps + different results.

About how to validate that it's actually the intended source lines - I'm not sure. linemaps is very fragile and I already noticed one case where it was wrong (it maps the pass line of a norun rule as if it was part of the original code). On the other hand I feel like the risk/cost tradeoff here isn't very high - even if there are weird edge cases it will only manifest in a triggered re-run on code change, and the benefit of bringing this back is very large IMO.

(in the future I believe a more robust approach would be to improve the source tracking in parser.py)

@royjacobson
Copy link
Copy Markdown
Contributor Author

@johanneskoester friendly ping :)

@royjacobson
Copy link
Copy Markdown
Contributor Author

@johanneskoester ?

@royjacobson
Copy link
Copy Markdown
Contributor Author

Hi @johanneskoester, is there anything I could help with to move this PR forward?
Thanks!

Copy link
Copy Markdown
Contributor

@johanneskoester johanneskoester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic, thanks a lot!

@johanneskoester
Copy link
Copy Markdown
Contributor

Sorry for the delay, there is a huge backlog to catch up with!

@johanneskoester
Copy link
Copy Markdown
Contributor

Best is always to ping me via discord.

@johanneskoester johanneskoester merged commit 2f663be into snakemake:main Sep 4, 2025
48 checks passed
johanneskoester pushed a commit that referenced this pull request Sep 5, 2025
🤖 I have created a release *beep* *boop*
---


##
[9.11.0](v9.10.1...v9.11.0)
(2025-09-05)


### Features

* add landing page to the report containing custom metadata defined with
a YTE yaml template
([#3452](#3452))
([c754d80](c754d80))
* Issue a warning when unsupported characters are used in wildcard names
([#1587](#1587))
([fa57355](fa57355))


### Bug Fixes

* avoid checking output files in immediate-submit mode
([#3569](#3569))
([58c42cf](58c42cf))
* clarify --keep-going flag help text to distinguish runtime vs DAG
construction errors
([#3705](#3705))
([a3a8ef4](a3a8ef4))
* enable docstring assignment in `use rule ... with:` block
([#3710](#3710))
([2191962](2191962))
* fix missing attribute error in greedy scheduler settings when using
touch, dryrun or immediate submit.
([6471004](6471004))
* only trigger script with CODE label
([#3707](#3707))
([2d01f8c](2d01f8c))
* parser.py incorrectly parsing multiline f-strings
([#3701](#3701))
([06ece76](06ece76))
* parsing ordinary yaml strings
([#3506](#3506))
([ddf334e](ddf334e))
* remove temp files when using checkpoints
([#3716](#3716))
([5ff3e20](5ff3e20))
* Restore python rules changes triggering reruns. (GH:
[#3213](#3213))
([#3485](#3485))
([2f663be](2f663be))
* unit testing
([#3680](#3680))
([06ba7e6](06ba7e6))
* use queue_input_wait_time when updating queue input jobs
([#3712](#3712))
([a945a19](a945a19))


### Documentation

* output files within output directories is an error
([#2848](#2848))
([#2913](#2913))
([37272e5](37272e5))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
kjohnsen pushed a commit to kjohnsen/snakemake that referenced this pull request Dec 15, 2025
) (snakemake#3485)

Trying to fix snakemake#3213 by caching the source code of python rules (using
`dis` + `linecache` magic).

I would appreciate any help in how to approach writing a test for this.

cc @corneliusroemer 

### QC
<!-- Make sure that you can tick the boxes below. -->

* [x] The PR contains a test case for the changes or the changes are
already covered by an existing test case.
* [x] The documentation (`docs/`) is updated to reflect the changes or
this is not necessary (e.g. if the change does neither modify the
language nor the behavior or functionalities of Snakemake).


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced rule processing to capture both shell command and Python
function definitions when available.
- Expanded the internal rule configuration to include additional source
code information, supporting improved debugging and documentation.
- Introduced a method to retrieve and store source details for rules
automatically.
- New workflow defined with rules for generating files based on
time-stamped content.
- **Bug Fixes**
- Added tests to validate the behavior of the new rule processing and
ensure expected outputs.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
kjohnsen pushed a commit to kjohnsen/snakemake that referenced this pull request Dec 15, 2025
🤖 I have created a release *beep* *boop*
---


##
[9.11.0](snakemake/snakemake@v9.10.1...v9.11.0)
(2025-09-05)


### Features

* add landing page to the report containing custom metadata defined with
a YTE yaml template
([snakemake#3452](snakemake#3452))
([c754d80](snakemake@c754d80))
* Issue a warning when unsupported characters are used in wildcard names
([snakemake#1587](snakemake#1587))
([fa57355](snakemake@fa57355))


### Bug Fixes

* avoid checking output files in immediate-submit mode
([snakemake#3569](snakemake#3569))
([58c42cf](snakemake@58c42cf))
* clarify --keep-going flag help text to distinguish runtime vs DAG
construction errors
([snakemake#3705](snakemake#3705))
([a3a8ef4](snakemake@a3a8ef4))
* enable docstring assignment in `use rule ... with:` block
([snakemake#3710](snakemake#3710))
([2191962](snakemake@2191962))
* fix missing attribute error in greedy scheduler settings when using
touch, dryrun or immediate submit.
([6471004](snakemake@6471004))
* only trigger script with CODE label
([snakemake#3707](snakemake#3707))
([2d01f8c](snakemake@2d01f8c))
* parser.py incorrectly parsing multiline f-strings
([snakemake#3701](snakemake#3701))
([06ece76](snakemake@06ece76))
* parsing ordinary yaml strings
([snakemake#3506](snakemake#3506))
([ddf334e](snakemake@ddf334e))
* remove temp files when using checkpoints
([snakemake#3716](snakemake#3716))
([5ff3e20](snakemake@5ff3e20))
* Restore python rules changes triggering reruns. (GH:
[snakemake#3213](snakemake#3213))
([snakemake#3485](snakemake#3485))
([2f663be](snakemake@2f663be))
* unit testing
([snakemake#3680](snakemake#3680))
([06ba7e6](snakemake@06ba7e6))
* use queue_input_wait_time when updating queue input jobs
([snakemake#3712](snakemake#3712))
([a945a19](snakemake@a945a19))


### Documentation

* output files within output directories is an error
([snakemake#2848](snakemake#2848))
([snakemake#2913](snakemake#2913))
([37272e5](snakemake@37272e5))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Code changes in python rules no longer trigger re-running of workflows

2 participants