Consider unterminated f-strings in FStringRanges#8154
Conversation
|
Current dependencies on/for this PR:
This comment was auto-generated by Graphite. |
7880903 to
c755335
Compare
| if !self.start_locations.is_empty() { | ||
| debug!( | ||
| "Unterminated f-strings detected at: {:?}", | ||
| self.start_locations | ||
| ); | ||
| } |
There was a problem hiding this comment.
I guess this isn't really required because we'd be providing diagnostics for the same.
There was a problem hiding this comment.
I wonnder if it would be useful to store the range with a missing and position instead of omitting it entirely.
There was a problem hiding this comment.
To extend on this: It might even be more correct to simply assume that everything after the fstring start token is inside the fstring if the end token is missing. However, what the correct interpretation is might depend on the specific usage. E.g. adding noqa comments might be a bad idea for unclosed fstring. Rules testing if the range is part of a multiline string could get away by assuming the string goes to the end of the file
There was a problem hiding this comment.
I thought of the same in one of my proposed solution.
For rules, they're only being used to check W605 (invalid escape sequence) and ISC (implicit string concatenation) which will be fine. For the former, the fix will be created (see below) while for later the fix won't be created.
-f"\.png{
+rf"\.png{The NoQA logic will need to be looked into though. We might have to somehow mark these ranges as "incomplete from source code point of view".
There was a problem hiding this comment.
I'm also fine with adding the entire range, but just omitting the range of an unterminated string seems wrong to me (I rather overestimate than underestimate the f-string range)
There was a problem hiding this comment.
We should probably avoid adding NoQA directives then otherwise it won't be lexed as a comment and the --add-noqa will keep on adding them.
f"\.png
# noqa: W605
# noqa: W605
# ...I'm not sure how to store this info. One solution could be to update the signature from BTreeMap<TextSize, TextRange> to BTreeMap<TextSize, FStringRange> where:
struct FStringRange {
range: TextRange,
// Is this f-string complete i.e., does it have both the start and end tokens?
complete: bool
}Or, while computing the NoQA mapping we can ignore these ranges by somehow checking if they're for a complete f-string or not.
@charliermarsh Any thoughts here?
There was a problem hiding this comment.
If we did change the type to TreeMap<TextSize, FStringRange>, where would we then use that information to avoid adding NOQA?
There was a problem hiding this comment.
Earlier I thought that ignoring such ranges here would work:
ruff/crates/ruff_linter/src/directives.rs
Lines 154 to 171 in 29fb86e
But, I don't think so as then, by default, Ruff will add the directive at the end of the line.
I think we can update this function to return a Option<TextSize> instead:
ruff/crates/ruff_linter/src/noqa.rs
Lines 746 to 762 in 29fb86e
which will be used here and continue on None:
ruff/crates/ruff_linter/src/noqa.rs
Line 527 in 29fb86e
Note that even if we somehow support this, it'll be impossible to ignore any violation on that line manually. For example,
# Unterminated f-string
f"\.png{x} # noqa: W605Here, the noqa comment is part of the f-string and is not a comment.
There was a problem hiding this comment.
The plan you described here makes sense to me. Thanks for walking me through it.
There was a problem hiding this comment.
I've a local branch with a possible implementation for this but there are a few complexities involved. I'll proceed with merging this and open a new PR with that change instead.
PR Check ResultsEcosystem✅ ecosystem check detected no changes. |
|
Shouldn't unterminated f-strings be caught in the parser? |
Yes, they're being caught. The issue which was highlighted in the linked issue is a debug assertion present in constructing the f-string ranges in the indexer. And, the indexer uses the flattened tokens for that purpose which ignores the error tokens. For the following code snippet, f"{123}We would've panicked but only in the debug build which is what this PR fixes. Actually, the release build could panic as well with: f"\.png${And, running with: $ pipx run "ruff==0.1.1" check --isolated --no-cache --select=W605,E999 ~/playground/ruff/fstring.py --fix
error: Panicked while linting /Users/dhruv/playground/ruff/fstring.py: This indicates a bug in Ruff. If you could open an issue at:
https://github.com/astral-sh/ruff/issues/new?title=%5BLinter%20panic%5D
...with the relevant file contents, the `pyproject.toml` settings, and the following stack trace, we'd be very appreciative!
panicked at 'called `Option::unwrap()` on a `None` value', crates/ruff_linter/src/rules/pycodestyle/rules/invalid_escape_sequence.rs:175:18
Backtrace: 0: _rust_eh_personality
1: _main
2: _rust_eh_personality
3: _rust_eh_personality
4: _rust_eh_personality
5: _rust_eh_personality
6: __rjem_je_witnesses_cleanup
7: __rjem_je_witnesses_cleanup
8: _main
9: _main
10: _main
11: _main
12: _main
13: _main
14: _main
15: _main
16: start
17: start
18: _main |
| (Tok::String { .. }, Tok::FStringStart) => ( | ||
| *a_range, | ||
| indexer.fstring_ranges().innermost(b_range.start()).unwrap(), | ||
| ), | ||
| (Tok::FStringEnd, Tok::String { .. }) => ( | ||
| indexer.fstring_ranges().innermost(a_range.start()).unwrap(), | ||
| *b_range, | ||
| ), | ||
| (Tok::FStringEnd, Tok::FStringStart) => ( | ||
| indexer.fstring_ranges().innermost(a_range.start()).unwrap(), | ||
| indexer.fstring_ranges().innermost(b_range.start()).unwrap(), | ||
| ), |
There was a problem hiding this comment.
These unwrap's aren't safe at all as I thought earlier. Ruff can panic on certain edge cases. It's better to account for the None case.
Sorry, i missed that the index is built during parsing, not after |
|
I believe the Indexer is built before parsing, but not as part of parsing itself. We need to build it regardless of whether the source code is valid, because we do support linting invalid programs (at least for the token-based rules). |
1e980fd to
29fb86e
Compare
MichaReiser
left a comment
There was a problem hiding this comment.
I would prefer if we express the fact that unterminated f-strings are not stored in the return type of innermost rather than just a note but are fine moving forward with this solution.
a7c0258 to
5398c07
Compare
FStringRanges

Summary
This PR removes the
debug_assertionin theIndexerto allow unterminated f-strings. This is mainly a fix in the development build which now matches the release build.The fix is simple: remove the
debug_assertionwhich means that the there could beFStringStartand possiblyFStringMiddletokens without a corresponding f-string range in theIndexer. This means that the code requesting for the f-string index need to account for theNonecase, making the code safer.This also updates the code which queries the
FStringRangesto account for theNonecase. This will happen when theFStringStart/FStringMiddletokens are present but theFStringEndtoken isn't which means that theIndexerwon't contain the range for that f-string.Test Plan
cargo testTaking the following code as an example:
f"{123}This only emits a
FStringStarttoken, but noFStringMiddleorFStringEndtokens.And,
f"\.png${This emits a
FStringStartandFStringMiddletoken, but noFStringEndtoken.fixes: #8065