[flake8-simplify] Implementation for split-of-static-string (SIM905)#14008
Conversation
721fd96 to
05e0a1a
Compare
|
| code | total | + violation | - violation | + fix | - fix |
|---|---|---|---|---|---|
| SIM905 | 31 | 31 | 0 | 0 | 0 |
MichaReiser
left a comment
There was a problem hiding this comment.
Nice! This is real good.
I think this mainly needs more tests .
The touch with maxsplit is nice but I haven't seen a single example in the ecosystem results. That's why I'm not sure if its worth the complexity or if we should just bail on providing a fix in this case.
| """ | ||
| itemA | ||
| itemB | ||
| itemC | ||
| """.split() |
There was a problem hiding this comment.
A few more examples would be great:
- Where the left is an empty string:
"".split() - An all whitespace string:
" ".split()ideally where the whitespace is a mixture of spaces, tabs, etc - Where the left contains no split points:
"/abc/".split()(found in the ecosystem results) - Multiline with different indents: see https://github.com/scikit-build/scikit-build-core/blob/f6e60f41e46ce13ddbd344542d1bf45743b74514/tests/test_skbuild_settings.py#L440-L443
- Strings with unicode flag
- Examples with comments in various positions
("a,b,c" # comment .split() )
We should also add examples for the following strings where I think it's okay if we don't support them
- Raw strings
- Implicit concatenated strings
- Implicit concatenated strings with commented parts (where there are comments between the parts)
- Implicit concatenated strings with different prefixes
There was a problem hiding this comment.
Added these cases, as well as maxsize=-1 that I found people also use.
There was a problem hiding this comment.
Implicit concatenated strings are supported, but the fix is unsafe because of the comments. I could also exclude them from fixing and mark the fix as safe if you prefer.
crates/ruff_linter/src/rules/ruff/rules/split_of_static_string.rs
Outdated
Show resolved
Hide resolved
crates/ruff_linter/src/rules/ruff/rules/split_of_static_string.rs
Outdated
Show resolved
Hide resolved
crates/ruff_linter/src/rules/ruff/rules/split_of_static_string.rs
Outdated
Show resolved
Hide resolved
| if let Some(ref replacement_expr) = split_replacement { | ||
| // Construct replacement list | ||
| let replacement = checker.generator().expr(replacement_expr); | ||
| diagnostic.set_fix(Fix::unsafe_edit(Edit::range_replacement( |
There was a problem hiding this comment.
What's your reasoning for making this an unsafe edit?
There was a problem hiding this comment.
I had not yet exhausted all edge cases (comments, implicit string concatenation, etc.), so I marked it unsafe to be on the safe side. Comments within ISC are not preserved, so in that case it's unsafe.
| // Autofix for maxsplit without separator not yet implemented | ||
| None |
There was a problem hiding this comment.
What makes maxsplit with the default separator difficult to implement?
There was a problem hiding this comment.
There is no equivalent to split for split_whitespace. Implementing it is not that much work, but it would be simplified greatly when this experimental API is stabilised (https://doc.rust-lang.org/std/str/struct.SplitWhitespace.html#method.remainder). I don't think it's worth to implement the logic now. I'll add a comment.
AlexWaygood
left a comment
There was a problem hiding this comment.
This rule exists in the Flake8-simplify linter as SIM905: MartinThoma/flake8-simplify#86
So I think it would be good for us to put it in our Flake8-simplify category with code SIM905
|
Thanks for the review @MichaReiser, will make another pass. Good catch @AlexWaygood, I'll remap to that category. |
ruff] Implementation for split-of-static-string (RUF035)flake8-simplify] Implementation for split-of-static-string (SIM905)
| } | ||
|
|
||
| fn split_sep(str_value: &str, sep_value: &str, max_split: usize, direction_left: bool) -> Expr { | ||
| let list_items: Vec<&str> = if direction_left && max_split > 0 { |
There was a problem hiding this comment.
The touch with maxsplit is nice but I haven't seen a single example in the ecosystem results. That's why I'm not sure if it's worth the complexity or if we should just bail on providing a fix in this case.
Maxsplit=1 is quite common (arguably people should use str.partition).
Another reason to keep the logic is that it can be useful to generalise for this pylint rule:
https://pylint.readthedocs.io/en/stable/user_guide/messages/convention/use-maxsplit-arg.html
MichaReiser
left a comment
There was a problem hiding this comment.
This is awesome. Thank you. It's up to you if you want to address any of the two comments. I merge whenever you tell me the PR's good to go :)
| op: UnaryOp::USub, | ||
| operand, | ||
| .. | ||
| }) if matches!(**operand, Expr::NumberLiteral { .. }) => 0, |
There was a problem hiding this comment.
I think we should bail here if maxplit is a negative number other than -1
There was a problem hiding this comment.
It's rare, but maxsplit=-2 behaves as if maxsplit is omitted.
From that perspective, any negative number could be flagged as non-idiomatic default, but that would be a different rule.
Working code in the wild:
(This is valid code, although likely a mistake)
|
@MichaReiser processed the comments, thanks for the pointers |
There was a problem hiding this comment.
Could we add a test for maxsplit=0. I suspect that we're handling that incorrectly to be the same as maxsplit=-1 where it isn't.
>>> "a,b,c".split(',', maxsplit=0)
['a,b,c']
>>> "a,b,c".split(',', maxsplit=-1)
['a', 'b', 'c']
>>> "a,b,c".split(',', maxsplit=-2)- negative values => Same as
maxsplitNone - 0 => No split
|
To make things more interesting: |
|
You're right, the |
2901dff to
56047ff
Compare
|
I decided to make it safe unless there are comments in the range (in which case, we still fix, but as unsafe). |
Summary
Closes #13944
Test Plan
Standard snapshot testing
flake8-simplify surprisingly only has a single test case