[ssort] Implement statement sorting#22235
[ssort] Implement statement sorting#22235JackAshwell11 wants to merge 13 commits intoastral-sh:mainfrom
ssort] Implement statement sorting#22235Conversation
bc76fe0 to
223c902
Compare
223c902 to
0662d0d
Compare
ntBre
left a comment
There was a problem hiding this comment.
Thanks! I need to do a much deeper review at some point, just a couple of passing thoughts while scrolling through to approve the workflows.
There was a problem hiding this comment.
Wow that's a lot of new test files! Thanks for being thorough here, but is there any chance we could condense this down a bit? I haven't taken a look at the implementation yet, but something we often do is create outer functions or classes to contain test cases. For example, could we test the same sorting behavior if this file were like:
class C:
def function():
return dependency()
def dependency():
return True
async def async_function():
return function()and then another file could be:
class D:
# another test case
...or do they really need to be separate files?
There was a problem hiding this comment.
Some of them could be compacted, but I'm hesitant to do this unless necessary as, since this rule rearranges files based on the function calls, I'm not sure how many side effects there will be if some of the tests are combined
There was a problem hiding this comment.
I think we already have a file for statement analysis in checkers/ast/analyze:
Could this check go there instead? Or maybe module.rs in the same directory.
There was a problem hiding this comment.
Should be fixed in latest
| /// def method_b(self): return self.method_a() | ||
| /// ``` | ||
| #[derive(ViolationMetadata)] | ||
| #[violation_metadata(preview_since = "v0.14.11")] |
There was a problem hiding this comment.
| #[violation_metadata(preview_since = "v0.14.11")] | |
| #[violation_metadata(preview_since = "0.14.11")] |
We don't use a v in our version numbers anymore.
There was a problem hiding this comment.
Should be fixed in latest
bd01b5b to
00aa687
Compare
Merging this PR will not alter performance
Comparing Footnotes
|
|
| code | total | + violation | - violation | + fix | - fix |
|---|---|---|---|---|---|
| SS001 | 1313 | 1313 | 0 | 0 | 0 |
| if let Expr::Name(value) = &*attr.value { | ||
| if value.id.as_str() == "self" { | ||
| self.names.insert(&attr.attr.id); | ||
| } | ||
| } |
There was a problem hiding this comment.
Don't we end up visiting attr.value twice? Or did you want to return early when seeing an attribute expression?
There was a problem hiding this comment.
This is unintentional, I'll fix it. Thanks 👍
| /// | ||
| /// ## Arguments | ||
| /// * `expr` - The expression node to visit. | ||
| fn visit_expr(&mut self, expr: &'a Expr) { |
There was a problem hiding this comment.
Does this visitor correctly handle scopes? What if two functions define the same variables but they're nested within each other? What about type vars?
There was a problem hiding this comment.
From my understanding, ssort is intentionally scope-agnostic as it only tracks text references, so as long as the text reference matches another function, this should work.
If two functions define the same variables but they're nested then this shouldn't handle they would be considered a separate suite (although I'd argue that is a problem with your code as you shouldn't name two things with the same name).
I'm not sure if type variables will work as I don't think visit_expr() handles annotations, but I could be wrong
There was a problem hiding this comment.
How does ssort ensure that type annotations don't break (specificially in project that don't use from __future__ import annotations)?
You probably want to adjust your visitor so that it doesn't traverse into nested classes or functions.
f3409a9 to
9ba4467
Compare
|
Thank you again for all of your work here! It's going to take me some time to digest this and review, but I have it on my todo list for the month. Happy new year! |
…d resolving a function's dependencies with basic autofix support
…ecomes a `Node` allowing top-level statements, which aren't functions or classes, to impact the dependency ordering This also makes the replacement string construction much easier as we can emit each node in order as it is fully sorted at that point
…matting and use the locator API
…nize_suite()` during string replacement
…rules Added comprehensive Rust tests for functionality not directly covered by the Python tests
- Optimised `DependencyVisitor::visit_expr()` so it doesn't slow Ruff down as much. - Removed some problematic Python test files. - Fixed a bug where `attr.value` is visited twice. - Fixed a bug where type annotations would cause sorting to break. - Fixed all clippy warnings.
9ba4467 to
fb6b12a
Compare
ssort] Implement statement reordingssort] Implement statement sorting
ntBre
left a comment
There was a problem hiding this comment.
Thanks for your work here and for your patience! I still need to wrap my head fully around the graph algorithm, but I did a deeper dive today and came up with some suggestions for simplification. In particular, I think if we can narrow down the (very thorough!) tests and restructure them so that the important features are more evident, it will be a bit easier to review the logical aspects of the PR.
| #[derive(Debug, Clone, CacheKey, Default)] | ||
| pub struct Settings { | ||
| /// Whether to sort statements in narrative order (definitions before usages). | ||
| pub narrative_order: bool, |
There was a problem hiding this comment.
I think I would go ahead and introduce a custom enum here. It sounds like in addition to narrative order and newspaper order there may be some interest in a django order in the future, and I think something like order = "narrative" or order = "newspaper" will read better for now anyway.
| use crate::settings::LinterSettings; | ||
| use crate::test::test_path; | ||
|
|
||
| #[test_case(Path::new("async_function.py"), false)] |
There was a problem hiding this comment.
I might be the odd one out here, but I typically prefer all of the files to have names like RULECODE_0.py. It's slightly less explanatory in the filename but also makes them easier to find when working on the rule later, in my experience.
| /// return self.method_a() | ||
| /// ``` | ||
| #[derive(ViolationMetadata)] | ||
| #[violation_metadata(preview_since = "0.14.11")] |
There was a problem hiding this comment.
We introduced this lately, but you can now use this to keep this updated automatically:
| #[violation_metadata(preview_since = "0.14.11")] | |
| #[violation_metadata(preview_since = "NEXT_RUFF_VERSION")] |
| (Flake8Logging, "015") => rules::flake8_logging::rules::RootLoggerCall, | ||
|
|
||
| // ssort | ||
| (SSort, "001") => rules::ssort::rules::UnsortedStatements, |
There was a problem hiding this comment.
I'd be curious to get other thoughts on this, but I'd probably just make this a RUF rule. We're hoping to move away from having all of these separate linters before long anyway (#1774), and I don't think it's worth adding a new linter prefix for one rule.
| #[test_case(Path::new("top_level_statement.py"), true)] | ||
| #[test_case(Path::new("type_alias.py"), false)] | ||
| #[test_case(Path::new("type_alias.py"), true)] | ||
| fn default(path: &Path, narrative_order: bool) -> Result<()> { |
There was a problem hiding this comment.
An Order type will also read a bit more clearly here than plain bools anyway.
| /// | ||
| /// ## Returns | ||
| /// A vector of nodes representing the statements in the suite. | ||
| pub(super) fn nodes_from_suite(suite: &'_ [Stmt]) -> Vec<Node<'_>> { |
There was a problem hiding this comment.
| pub(super) fn nodes_from_suite(suite: &'_ [Stmt]) -> Vec<Node<'_>> { | |
| pub(super) fn nodes_from_suite(suite: &[Stmt]) -> Vec<Node> { |
Not sure if it will let you elide the second one, but the first one should definitely be okay.
| index, | ||
| name, | ||
| stmt, | ||
| movable: name.is_some(), |
There was a problem hiding this comment.
Do we need to track this field separately if it's just name.is_some()?
| /// Test that `CycleError` displays correctly. | ||
| #[test] | ||
| fn cycle_error_display() { | ||
| let err = CycleError { | ||
| cycle: vec![0, 1, 2], | ||
| }; | ||
| let display = format!("{err}"); | ||
|
|
||
| assert!(display.contains("Cycle detected")); | ||
| assert!(display.contains('0')); | ||
| assert!(display.contains('1')); | ||
| assert!(display.contains('2')); | ||
| } |
There was a problem hiding this comment.
This really seems like overkill to me. Do we render this to users anywhere?
| /// Test that `CycleError` implements `std::error::Error`. | ||
| #[test] | ||
| fn cycle_error_is_error() { | ||
| let err = CycleError { cycle: vec![0] }; | ||
| let _: &dyn std::error::Error = &err; | ||
| } |
There was a problem hiding this comment.
This also seems unnecessary. In fact, it seems that we only use the Error implementation in tests? I would probably just drop the Error and Display impls entirely, along with these tests.
| let source = r#" | ||
| def foo(): | ||
| return 1 | ||
|
|
||
| def bar(): | ||
| return 2 | ||
| "#; | ||
| let parsed = parse_module(source)?; | ||
| let nodes = nodes_from_suite(parsed.suite()); | ||
| let mut visitor = DependencyVisitor::new(&nodes); |
There was a problem hiding this comment.
There seems to be a lot of common code here again. Can we factor out a test helper?
|
Oh I also resolved the conflicts and got the tests passing while I was trying out the code locally, so I pushed that up too. |
Summary
Closes #3946 and #2436
narrative_orderallowing the statement ordering to be reversed meaning the calling functions are defined before the dependency functions.Note that this PR intentionally avoids sorting statements inside classes such as the docstring/special attribute/lifecycle methods/etc which ssort implements (https://github.com/bwhmather/ssort?tab=readme-ov-file#output) as these fall under #2425 which needs further discussion.
Test Plan
I've checked the
ruff-ecosystemresults and each change looks to be correct with this new rule (although there will be many rule violations).