Conversation
PR Check ResultsBenchmarkLinuxWindows |
MichaReiser
left a comment
There was a problem hiding this comment.
Oh wow nice! Excellent work.
What I understand from the code is that the formatted now uses dynamic_text for every number constant even if it is already correctly formatted. This is, unfortunately, somewhat expensive because every dynamic_text allocates a String when writing the text.
Would you be interested in taking a stab at implementing an optimisation similar to what we did in normalize_string
ruff/crates/ruff_python_formatter/src/expression/string.rs
Lines 413 to 481 in 33657d3
The trick is to use a Cow and return Cow::Borrowed if the number is already correctly formatted (in which case we can print the code from the source using source_text_slice (extremely fast). The implementation returns a Cow::Owned if it reformatted the number and it then uses dynamic_text.
The benefit of this optimisation is that we only pay the cost of allocating when formatting the file for the first time. Checking the formatting in subsequent passes (when there are no changes) takes the fast path instead.
Let me know if that's something you're interested in and if so, if you want to tackle this as part of this PR or want to do a follow up instead.
|
|
||
| if content.starts_with("0x") || content.starts_with("0X") { | ||
| let hex = content.get(2..).unwrap().to_ascii_uppercase(); | ||
| write!(f, [text("0x"), dynamic_text(&hex, None)]) |
There was a problem hiding this comment.
We should pass range.start() as the second argument to all dynamic_text usages. The position is used in the generated source map and we'll use the source map for range formatting.
|
I'll implement optimization like |
MichaReiser
left a comment
There was a problem hiding this comment.
Alright. I'll assign this back to you. Feel free to ping me if you have any questions and request another review once your done.
…r is already normalized
| +x = (0B1011).conjugate() | ||
| +x = (0O777).real | ||
| +x = 123456789.123456789j.__add__((0b1011).bit_length()) | ||
| +x = (0xB1ACC).conjugate() |
There was a problem hiding this comment.
now that we have no more trailing dots in floats we can also fix the parentheses rules for them (maybe better in a follow-up PR so that the scope of this PR doesn't get too large)
| for (index, c) in chars { | ||
| if matches!(c, 'e' | 'E') { | ||
| has_exponent = true; | ||
| if input.as_bytes().get(index - 1).copied() == Some(b'.') { |
There was a problem hiding this comment.
I believe this could, at least in theory, result in indexing between two character boundaries if the text starts with [e10]. I don't think this is a problem in practice because it's impossible that the preceding character is a unicode character, because the number would then be part of an identifier. I still recommend fixing this just to be sure (e.g. by assigning `is_float on line 147)
Summary
Format int, float and complex constants.
Test Plan
Existing snapshots.