fix: parser.py incorrectly parsing multiline f-strings (#3701)
johanneskoester merged 6 commits into snakemake:main from
Conversation
Fix multiline f-strings removing lines; fixes snakemake#3700

📝 Walkthrough

Rework of f-string parsing in `parser.py`.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Tokenizer
    participant Parser
    participant SplitLines as split_token_lines
    Tokenizer->>Parser: emit initial f-string TokenInfo
    Parser->>SplitLines: split_token_lines(initial_token)
    SplitLines-->>Parser: map[line_no -> fragment]
    loop remaining f-string tokens
        Tokenizer->>Parser: emit next TokenInfo
        Parser->>SplitLines: split_token_lines(next_token)
        SplitLines-->>Parser: additional fragments
        Parser->>Parser: merge fragments into lines map
    end
    Parser->>Parser: determine overall start/end, trim boundaries
    Parser->>Parser: join sorted lines -> reconstructed f-string
    Parser-->>Tokenizer: return reconstructed fragment
```
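The split-and-merge flow in the diagram can be sketched in plain Python. `Tok` below is a hypothetical stand-in for the fields of `tokenize.TokenInfo` that the parser uses; the real code receives Python 3.12+ f-string tokens.

```python
from collections import namedtuple

# Mock of the TokenInfo fields used here: start/end are (line, col) pairs,
# line is the full source text the token spans (newlines preserved).
Tok = namedtuple("Tok", ["start", "end", "line"])

def split_token_lines(tok):
    """Map each physical line number a token spans to that line's text,
    newlines preserved, mirroring the helper described in the walkthrough."""
    fragments = {}
    lineno = tok.start[0]
    for text in tok.line.splitlines(keepends=True):
        fragments[lineno] = text
        lineno += 1
    return fragments

# A token spanning three physical lines (as an FSTRING_MIDDLE can in 3.12):
tok = Tok(start=(1, 0), end=(3, 5), line='f"""abc\ndef\n{1+1}"""\n')
print(split_token_lines(tok))
# → {1: 'f"""abc\n', 2: 'def\n', 3: '{1+1}"""\n'}
```

The parser then merges such fragment maps across all tokens of the f-string before trimming the boundaries.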
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Assessment against linked issues
Not sure if this is the best way, but it fixes my minimal example (see #3700). Feel free to edit.
Actionable comments posted: 0
🔭 Outside diff range comments (1)
src/snakemake/parser.py (1)
111-169: Fix potential infinite loop and improve line collection logic

The `parse_fstring` method has several issues that need to be addressed:

- Potential infinite loop: Line 132 increments `line` unconditionally even when `token_.end[0]` might be less than `token_.start[0]` (though unlikely in practice).
- Incorrect conditional logic: Line 138 uses `>` when it should be `<` to keep the longer content.
- Line parsing complexity: The nested line iteration (lines 130-142) is overly complex for what should be a simpler operation.
Apply this diff to fix the issues:
```diff
 def parse_fstring(self, token: tokenize.TokenInfo):
     """
     only for python >= 3.12, since then python changed the parsing manner
     of f-string, see [pep-0701](https://peps.python.org/pep-0701)
     Here, we just read where the f-string start and end from tokens.
     Luckily, each token records the content of the line, and we can just
     take what we want there.
     """
     lines_contents = dict()

     # collect all lines until the end of the f-string
     def collect_lines(token_: tokenize.TokenInfo):
         lines_contents[token_.start[0]] = token_.line
         # if the token spans multiple lines, we need to collect all of them
         if token_.end[0] != token_.start[0]:
             # iterate over all lines that the token spans
             token_string = token_.line
             line = token_.start[0]
-            while line <= token_.end[0]:
+            while line <= token_.end[0] and token_string:
                 end_of_line_index = (token_string.index("\n") + 1) if "\n" in token_string \
                     else len(token_string)
                 lines_content = token_string[: end_of_line_index]
                 # don't take the new line content if the one already stored is longer
-                if line not in lines_contents or len(lines_contents[line]) > len(lines_content):
+                if line not in lines_contents or len(lines_contents[line]) < len(lines_content):
                     lines_contents[line] = lines_content
                 token_string = token_string[end_of_line_index:]
                 line += 1
```
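A toy illustration of the corrected comparison (the fragments below are made up): when two tokens report content for the same physical line, the longer fragment is the more complete one, so it should win.

```python
# Hypothetical fragments: a shorter fragment for line 3 was stored first.
stored = {3: "def\n"}
line, content = 3, "def  # the full physical line\n"

# The corrected condition from the review: replace only when the
# newly seen content is LONGER than what is already stored.
if line not in stored or len(stored[line]) < len(content):
    stored[line] = content

print(stored[3])  # the longer, more complete fragment wins
```

With the original `>` comparison, the shorter fragment would have survived, dropping part of the line.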
🧹 Nitpick comments (1)
src/snakemake/parser.py (1)
133-134: Simplify ternary expression using min()

The ternary expression can be simplified for better readability.

```diff
-            end_of_line_index = (token_string.index("\n") + 1) if "\n" in token_string \
-                else len(token_string)
+            end_of_line_index = (token_string.index("\n") + 1) if "\n" in token_string else len(token_string)
```
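For what it's worth, one branch-free alternative (a sketch, not necessarily what the reviewer meant by "using min()") relies on `str.find` returning -1 when the substring is absent:

```python
def end_of_line_index(token_string):
    # find() returns -1 when "\n" is absent; -1 + 1 == 0 is falsy,
    # so `or` falls through to the full string length.
    return token_string.find("\n") + 1 or len(token_string)

print(end_of_line_index("abc\ndef"))  # 4 (index just past the newline)
print(end_of_line_index("abc"))      # 3 (no newline: whole string)
```

Whether this is clearer than the ternary is a matter of taste; the behavior is identical.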
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
src/snakemake/parser.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit Configuration File
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the `self` argument of methods.
Do not suggest type annotation of the `cls` argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a `return` statement.
Files:
src/snakemake/parser.py
🧬 Code Graph Analysis (1)
src/snakemake/parser.py (3)
- src/snakemake/gui.py (1): `index` (84-95)
- src/snakemake/rules.py (1): `snakefile` (142-143)
- src/snakemake/workflow.py (1): `snakefile` (1539-1544)
🔇 Additional comments (2)
src/snakemake/parser.py (2)
156-167: Well-structured f-string boundary handling

The logic for determining and trimming f-string boundaries is clean and correct. The code properly:

- Captures the outermost f-string boundaries
- Trims the first line from the start position
- Trims the last line to the end position
- Joins all lines in the correct order

146-154: Correct f-string nesting handling

The logic for tracking nested f-strings using a counter is well-implemented and correctly handles the Python 3.12+ tokenization behavior.
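A minimal sketch of the counter-based consumption this comment describes, using token-type names as plain strings (a mock of the Python 3.12 token stream, not the parser's actual code): each `FSTRING_START` increments the depth, each `FSTRING_END` decrements it, and the outermost f-string is complete when the counter returns to zero.

```python
def consume_fstring(token_types):
    """Consume token types until the outermost f-string is closed."""
    depth = 0
    consumed = []
    for t in token_types:
        consumed.append(t)
        if t == "FSTRING_START":
            depth += 1
        elif t == "FSTRING_END":
            depth -= 1
            if depth == 0:  # outermost f-string closed
                break
    return consumed

stream = ["FSTRING_START", "FSTRING_MIDDLE", "OP",
          "FSTRING_START", "FSTRING_END",   # nested f-string
          "OP", "FSTRING_END", "NEWLINE"]
print(consume_fstring(stream))  # stops at the second FSTRING_END, before NEWLINE
```

The nested `FSTRING_END` at depth 2 does not terminate the loop, which is exactly the behavior the counter protects.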
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/snakemake/parser.py (3)
121-138: Consider adding edge case handling for malformed f-strings.

While the implementation correctly handles well-formed f-strings, consider adding defensive checks for edge cases where tokens might be malformed or incomplete. For example, you could add validation after line 138:

```diff
                 # remainder of the line after the newline character
                 token_string = token_string[end_of_line_index:]
                 i += 1
+
+        # Ensure we have content for the token's end line
+        if token_.end[0] not in lines_contents:
+            lines_contents[token_.end[0]] = ""
```
153-153: Clarify the comment about the `t1` variable.

The inline comment could be more explicit about what `t1` represents at this point in the code.

```diff
-        f_string_end = t1.end
+        f_string_end = t1.end  # t1 is the FSTRING_END token closing the outermost f-string
```
111-163: Consider adding a test case for the multiline f-string fix.

The PR checklist indicates that no test case was added. Given that this fixes a specific bug (#3700), it would be valuable to add a regression test to ensure the issue doesn't reoccur.

Would you like me to help generate a test case that verifies multiline f-strings with literal lines are correctly preserved? This would prevent regression of issue #3700.
📒 Files selected for processing (1)
src/snakemake/parser.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/snakemake/parser.py (2)

- src/snakemake/rules.py (1): `snakefile` (142-143)
- src/snakemake/workflow.py (1): `snakefile` (1539-1544)
🔇 Additional comments (2)
src/snakemake/parser.py (2)
111-163: LGTM! The fix correctly handles multiline f-strings.

The rewritten `parse_fstring` method properly collects all lines spanned by the f-string tokens and preserves literal lines without expressions. The approach of using a line-by-line dictionary to accumulate content and then trimming only the boundaries is sound and directly addresses the issue reported in #3700.
143-149: Ensure nested f-strings are tested under Python 3.12

I didn't find any existing test coverage for nested f-strings. To be confident that the `isin_fstring` counter handles all the new parsing rules in Python 3.12, please:

- Add dedicated tests (e.g., in `tests/test_parser.py`) covering complex nested f-string scenarios:
  - f-strings inside f-strings with conversions (`!r`, `!s`) and format specifiers
  - Multiple nesting levels and escaped braces
- Verify that `collect_lines` still correctly captures inner expressions and that `isin_fstring` increments/decrements as expected
- Run the full test suite on Python 3.12 to confirm no regressions
johanneskoester
left a comment
Thanks a lot!! Can you add a test case that failed before and works now?
Actionable comments posted: 0
🧹 Nitpick comments (5)
src/snakemake/parser.py (4)
78-91: Helper looks correct; consider minor doc tweak.

`split_token_lines()` correctly maps physical line numbers to their full line text with newlines preserved. No functional concerns. As a tiny nit, consider avoiding hard-coded token-type integers in the example to keep the doc resilient across Python versions.
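To illustrate the nit: token types can be displayed by name via `tokenize.tok_name`, which stays stable even though the underlying integer values shift between Python versions.

```python
import io
import tokenize

# Tokenize a tiny statement and collect symbolic token names instead of
# the raw integer type codes.
names = [
    tokenize.tok_name[tok.type]
    for tok in tokenize.generate_tokens(io.StringIO("x = 'hi'\n").readline)
]
print(names)  # ['NAME', 'OP', 'STRING', 'NEWLINE', 'ENDMARKER']
```

Docstring examples written this way keep working even if a new Python release renumbers the token table.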
136-160: Remove dead code: `lines_contents` and `collect_lines()` are unused.

This block builds `lines_contents` but it's never read when constructing `t`. It adds complexity without effect and may confuse future maintainers. Apply this diff to drop the unused logic:

```diff
-        lines_contents = dict()
-
-        def collect_lines(token_: tokenize.TokenInfo):
-            # iterate over all lines that the token spans
-            token_string = token_.line
-            i = token_.start[0]
-            while i <= token_.end[0]:
-                end_of_line_index = (
-                    (token_string.index("\n") + 1)
-                    if "\n" in token_string
-                    else len(token_string)
-                )
-                line_content = token_string[:end_of_line_index]
-
-                # don't take the new line content if the one already stored is longer
-                if i not in lines_contents or len(lines_contents[i]) < len(
-                    line_content
-                ):
-                    lines_contents[i] = line_content
-
-                # remainder of the line after the newline character
-                token_string = token_string[end_of_line_index:]
-                i += 1
-
-        # Collect lines for the first and subsequent tokens that are part of the f-string
-        collect_lines(token)
```
162-162: Remove unused variable `related_lines`.

Static analysis is right: this local is never used. Apply this diff:

```diff
-        related_lines = token.start[0]
```
166-167: Make end-index computation explicit to avoid negative slicing subtlety.

`t1.end[1] - len(t1.line)` relies on negative indexing semantics against `s`, which is correct but non-obvious. Computing the absolute end offset improves clarity and reduces off-by-one risks during future edits. While here, you can safely update the `lines` map on every iteration; values for the same physical line are identical across tokens (TokenInfo.line is the full physical line), so overriding is harmless and simplifies control flow.

Apply this diff:

```diff
-            if t1.end[0] not in lines:
-                lines.update(split_token_lines(t1))
+            # Merge the physical lines covered by this token. Overwrites are safe because
+            # TokenInfo.line holds the entire physical line for every token on that line.
+            lines.update(split_token_lines(t1))
@@
-        s = "".join(lines[i] for i in sorted(lines))
-        t = s[token.start[1] : t1.end[1] - len(t1.line)]
+        s = "".join(lines[i] for i in sorted(lines))
+        # Map the tokenizer's end column back into the concatenated buffer 's'.
+        end_in_s = len(s) - len(t1.line) + t1.end[1]
+        t = s[token.start[1] : end_in_s]
```

Also applies to: 175-179
tests/test_fstring/Snakefile (1)

28-38: Test update captures the regression well; suggest adding the exact minimal reproducer from #3700.

The adjusted expectation verifies that literal lines adjacent to expression braces are preserved. To lock in the specific issue (#3700: dropping a pure-literal middle line), consider adding this assertion as well:

```diff
+# Regression for issue #3700: preserve pure-literal middle line in multi-line f-strings
+assert (
+    f"""abc
+def
+{1+1}"""
+    == """abc
+def
+2"""
+)
```

I can push this as a follow-up commit if you'd like.
📒 Files selected for processing (2)
- src/snakemake/parser.py (2 hunks)
- tests/test_fstring/Snakefile (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-06T14:09:54.370Z
Learnt from: johanneskoester
PR: snakemake/snakemake#3117
File: tests/test_wrapper/Snakefile:11-11
Timestamp: 2024-10-06T14:09:54.370Z
Learning: Changes made within test cases, such as in `tests/test_wrapper/Snakefile`, are for testing purposes and do not require updates to the project documentation.
Applied to files:
tests/test_fstring/Snakefile
🪛 Ruff (0.12.2)
src/snakemake/parser.py
162-162: Local variable `related_lines` is assigned to but never used
Remove assignment to unused variable `related_lines`
(F841)
🔇 Additional comments (1)
src/snakemake/parser.py (1)
187-189: Correctness check: placement of `parse_fstring()` call is appropriate.

Calling `parse_fstring()` when encountering `FSTRING_START` and then replacing the last `cmd` tuple ensures `AbstractCmd` consumers receive the full f-string text as a single unit. This is key for preserving multi-line literal segments (the bug fix target). If you want to double-check across a range of f-string shapes (single-line, nested, trailing text), I can provide a small corpus and a script to diff the pre/post parser outputs.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/snakemake/parser.py (1)
139-140: Consider validating the safety comment.

The commented safety check suggests checking all tokens might be necessary. While the current approach works for the reported issue, edge cases with complex nested f-strings might benefit from this validation. Consider uncommenting the safety check or documenting why it's not needed:

```diff
-        # if t1.end[0] not in lines: for safety, check all tokens
-        lines.update(split_token_lines(t1))
+        # Update lines dict with all tokens in the f-string to ensure
+        # we capture all line content, even from overlapping tokens
+        lines.update(split_token_lines(t1))
```
📒 Files selected for processing (2)
- src/snakemake/parser.py (2 hunks)
- tests/test_fstring/Snakefile (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/test_fstring/Snakefile
🔇 Additional comments (5)
src/snakemake/parser.py (5)
78-91: LGTM! Well-documented helper function for token line splitting.

The new `split_token_lines` function properly handles multiline tokens by returning line number and content pairs. The docstring with the concrete example clearly illustrates the problem being solved.

136-136: Good use of dict for efficient line aggregation.

Converting the split lines to a dictionary ensures unique line numbers are maintained, which is crucial for handling overlapping tokens in f-strings.

126-152: Fix successfully addresses the multiline f-string issue.

The refactored `parse_fstring` method now properly aggregates all lines from f-string tokens, which correctly preserves literal lines that don't contain expressions. This directly addresses issue #3700 where the middle literal line "def" was being omitted. The approach of collecting all unique lines across the f-string token sequence and then reconstructing the content is sound.
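The collect-join-trim approach praised above can be sketched end to end. All line numbers, columns, and contents below are assumed for illustration, not real tokenizer output.

```python
# Collected physical lines of an f-string, keyed by line number
# (as split_token_lines would produce across the token sequence).
lines = {
    2: '    x = f"""abc\n',
    3: "def\n",
    4: '{1+1}"""  # trailing\n',
}
start = (2, 8)   # f-string begins at line 2, column 8 (after the indent and "x = ")
end = (4, 8)     # closing quotes end at line 4, column 8
last_line = lines[end[0]]

# Join lines in order, then trim to the f-string's start column and end column.
s = "".join(lines[i] for i in sorted(lines))
t = s[start[1] : end[1] - len(last_line)]
print(t)  # the full f-string text, middle literal line "def" preserved
```

The key point is that the pure-literal middle line survives because every physical line is in the dictionary, regardless of whether a token expression touched it.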
148-149: Line trimming calculation validated and correct

I've reviewed the Python 3.12 tokenize behavior and our existing multiline f-string tests (e.g. tests/test_fstring/Snakefile covers pure-literal middle lines, embedded expressions, conditional parts, and curly-brace escaping). The slice `t = s[token.start[1] : t1.end[1] - len(t1.line)]` correctly uses the start column of the FSTRING_START token and the length of the FSTRING_END token's `.line` (which includes the closing quotes and trailing newline) to trim indentation and remove only the closing quotes/newline, preserving the entire f-string literal as intended. No changes needed here.
149-149: Confirm slice logic for f-strings under Python 3.12+

`parse_fstring` is only invoked when running on Python ≥ 3.12 (guarded by `is_fstring_start`), so your current tests in an earlier interpreter won't exercise the `FSTRING_START`/`FSTRING_END` paths. Please re-run the token-position script in a Python 3.12+ environment, for example:

```shell
python3.12 - << 'EOF'
import sys, tokenize, io

test_cases = [
    'f"""line1\nline2\n{1+1}"""',
    "f'''abc\ndef\n{1+1}'''",
    'f"simple {x}"',
]

for s in test_cases:
    print(f"\nTest case: {repr(s)}")
    for tok in tokenize.generate_tokens(io.StringIO(s).readline):
        if tok.type in (tokenize.FSTRING_START, tokenize.FSTRING_MIDDLE, tokenize.FSTRING_END):
            print(f"  {tokenize.tok_name[tok.type]} start={tok.start} end={tok.end} line={repr(tok.line)}")
EOF
```

Once you have the true positions, verify that for both single- and multi-line f-strings:

- The slice end `t1.end[1] - len(t1.line)` (a negative offset) correctly trims only the closing quotes, without dropping or including extra characters.
- No off-by-one errors arise from trailing newlines or incomplete `tok.line` contents.

If you find that the negative offset approach is unreliable, you can compute an absolute end index instead:

```diff
-        t = s[token.start[1] : t1.end[1] - len(t1.line)]
+        # prefix_len = length of the first line of the f-string
+        prefix_len = len(token.line)
+        # abs_end = prefix_len + column index of the closing quote
+        abs_end = prefix_len + t1.end[1]
+        t = s[token.start[1] : abs_end]
```
@cademirch I just pushed the commits here, feel free to edit~

Thanks, guys!
🤖 I have created a release *beep* *boop*

---

## [9.11.0](v9.10.1...v9.11.0) (2025-09-05)

### Features

* add landing page to the report containing custom metadata defined with a YTE yaml template ([#3452](#3452)) ([c754d80](c754d80))
* Issue a warning when unsupported characters are used in wildcard names ([#1587](#1587)) ([fa57355](fa57355))

### Bug Fixes

* avoid checking output files in immediate-submit mode ([#3569](#3569)) ([58c42cf](58c42cf))
* clarify --keep-going flag help text to distinguish runtime vs DAG construction errors ([#3705](#3705)) ([a3a8ef4](a3a8ef4))
* enable docstring assignment in `use rule ... with:` block ([#3710](#3710)) ([2191962](2191962))
* fix missing attribute error in greedy scheduler settings when using touch, dryrun or immediate submit. ([6471004](6471004))
* only trigger script with CODE label ([#3707](#3707)) ([2d01f8c](2d01f8c))
* parser.py incorrectly parsing multiline f-strings ([#3701](#3701)) ([06ece76](06ece76))
* parsing ordinary yaml strings ([#3506](#3506)) ([ddf334e](ddf334e))
* remove temp files when using checkpoints ([#3716](#3716)) ([5ff3e20](5ff3e20))
* Restore python rules changes triggering reruns. (GH: [#3213](#3213)) ([#3485](#3485)) ([2f663be](2f663be))
* unit testing ([#3680](#3680)) ([06ba7e6](06ba7e6))
* use queue_input_wait_time when updating queue input jobs ([#3712](#3712)) ([a945a19](a945a19))

### Documentation

* output files within output directories is an error ([#2848](#2848)) ([#2913](#2913)) ([37272e5](37272e5))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Fix multiline F-Strings removing lines; fixes snakemake#3700

### QC

* [x] The PR contains a test case for the changes or the changes are already covered by an existing test case.
* [x] The documentation (`docs/`) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

## Summary by CodeRabbit

- Bug Fixes
  - Correctly parses multi-line f-strings (Python 3.12+), preventing crashes, truncation, and misaligned spans.
  - Preserves pure-literal middle lines in multi-line f-strings and handles complex/nested cases more reliably.
- Refactor
  - Reworked f-string parsing to aggregate content across full token spans for more accurate line handling without changing public APIs.
- Tests
  - Added regression test for preserving literal middle lines in multi-line f-strings (issue snakemake#3700) and updated expectations.

---------

Co-authored-by: hwrn <hwrn.aou@sjtu.edu.cn>
Co-authored-by: Aoran Hu <48747984+Hocnonsense@users.noreply.github.com>