Skip to content

Fix RUF027 false negative with escape sequences in format specs#23525

Open
FrederickStempfle wants to merge 2 commits intoastral-sh:mainfrom
FrederickStempfle:fix/format-spec-escape-sequences
Open

Fix RUF027 false negative with escape sequences in format specs#23525
FrederickStempfle wants to merge 2 commits intoastral-sh:mainfrom
FrederickStempfle:fix/format-spec-escape-sequences

Conversation

@FrederickStempfle
Copy link

Fixes #23360

RUF027 was extracting format specs as raw source text, so something like "{a:\x64}" would pass \x64 to FormatSpec::parse instead of the decoded d. This meant it wouldn't recognize the string as a valid f-string candidate.

The parser already decodes escape sequences in the AST literal elements, so I switched to using those instead of slicing the source directly.

Added a test case covering this.

let spec_str = if spec.elements.interpolations().next().is_none() {
// Use decoded literal values so escape sequences like `\x64`
// are resolved before parsing.
spec_text = spec.elements.literals().map(|lit| &*lit.value).collect::<String>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is over-fitting on the specific case I included in the issue. How does this PR handle a case like:

f'{1:"\n"}'

I suspect that this is an issue with backslashes in the format specifier not unique to hex escapes. This case obviously causes a runtime error as written, but classes can override __format__ to make this work:

>>> class C:
...     def __format__(self, spec):
...         return str(self) + spec
...
>>> c = C()
>>> print(f"{c:'\n'}")
<__main__.C object at 0x7f3ad2ff4980>'
'
>>>

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Worth clarifying: the fix isn't specific to hex escapes. It decodes all escape sequences by using the AST's already-decoded literal values instead of slicing the raw source so it handles \n, \t, \x64, etc. uniformly.

For the f'{1:"\n"}' case since RUF027 only applies to non-f-strings the relevant scenario would be a regular string like '{c:"\n"}'. With my fix the format spec gets decoded to an actual newline which FormatSpec::parse rejects so the string doesn't get flagged. No false positive.

As for custom __format__ overrides those are hard to handle without type information. A custom spec could be arbitrary so I think it's acceptable to leave that case undetected rather than introduce false positives trying to guess.

That said if you think the right fix is broader (e.g., skipping format spec validation entirely when the spec contains any backslash escape) I'm happy to explore that. The trade-off would be potentially flagging things like "{c:\n}" that aren't actually convertible to f-strings in most real-world code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a false negative not a false positive, so this spec shouldn't be rejected as invalid and RUF027 should be emitted.

If RUF027 wants additional heuristics that would be orthogonal to the issue, which is that these specs should be parsed as valid like in CPython.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Removed the FormatSpec::parse check entirely — since Python doesn't validate spec content at the f-string level and just passes it straight to format, any string is technically a valid spec. The check was stricter than CPython and causing false negatives like the newline case. Added a test for it too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely not what I meant. We use the spec parsing as a way to avoid false positives for the rule, as one of the modified test cases shows. We absolutely should not delete that from RUF027.

I'm somewhat inclined to close this PR because it's not really addressing the issue at all, which doesn't even mention RUF027.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry i missed that. Should the fix just be in FormatSpec::parse itself to decode escape sequences before parsing? And then keep the RUF027 like before?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would check how our actual parser handles format specs and try to bring FormatSpec into alignment with that. Decoding escape sequences before parsing doesn't sound right to me, but again I'd have to check what our parser does.

But yes, I was not picturing any RUF027-specific changes when I filed the issue, although the behavior of FormatSpec obviously affects RUF027.

Python does not validate format spec content at the f-string level; it
delegates that entirely to __format__. Using FormatSpec::parse to gate
whether a string is an f-string candidate was therefore too strict and
caused false negatives for specs with unusual but valid content (e.g.
escape sequences that decode to non-standard characters like newlines).

Remove the check entirely and add a test case covering a newline in a
format spec, which should now correctly trigger RUF027.
@ntBre ntBre added the bug Something isn't working label Feb 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Edge case in FormatSpec::parse

2 participants