[refurb] Fix FURB101 and FURB103 false positives when I/O variable is used later#23542
[refurb] Fix FURB101 and FURB103 false positives when I/O variable is used later#23542amyreese merged 20 commits intoastral-sh:mainfrom
refurb] Fix FURB101 and FURB103 false positives when I/O variable is used later#23542Conversation
|
| code | total | + violation | - violation | + fix | - fix |
|---|---|---|---|---|---|
| FURB101 | 6 | 3 | 3 | 0 | 0 |
| FURB103 | 5 | 2 | 3 | 0 | 0 |
refurb] Fix FURB101, FURB103[refurb] Fix FURB101 and FURB103 false positives when I/O variable is used later
[refurb] Fix FURB101 and FURB103 false positives when I/O variable is used laterrefurb] Fix FURB101 and FURB103 false positives when I/O variable is used later
This comment was marked as resolved.
This comment was marked as resolved.
amyreese
left a comment
There was a problem hiding this comment.
Thanks for working on this. Overall I think this is a very large amount of code. Is all of this actually necessary just to see if a variable is used somewhere else? Can we skip checking if it's rebound elsewhere? Erring on the side of a false negative would generally be better than false positives, and if that could greatly reduce the amount of logic necessary, that may be worth it.
| let following_statements = following_statements_after_with( | ||
| with, | ||
| checker.semantic().current_statement_parent(), | ||
| checker.python_ast(), | ||
| ); | ||
| let candidates = find_file_opens( |
There was a problem hiding this comment.
Rather than doing up-front work to find all the statements for every with block, even if it has nothing to do with opening a file, this should first see if there are any candidates, and only then fetch the following statements and filter the list of candidates afterwards.
Or at the very least, the call to get following statements should happen further inside of say, resolve_file_open where the data will actually be used, at the point that it will actually be needed and no sooner.
|
Also, looking at the ecosystem results, it seems we're already pushing towards false negatives here, so I'm not sure how much this is actually benefiting the rule. Were there any false positives in the ecosystem report that should have been resolved with this PR? |
I looked at the changes in the ecosystem from the bot's post, none of these changes are false negatives, since the variables below are used in f.name or f.is_file() |
maybe I'm missing something, but with open(file_path, "w") as f:
f.write("hello world")
...
files_in_repo = {f.name for f in bundle.path.iterdir() if f.is_file()}note the |
Yes, you were right, when I looked at these bot ecosystem results, I didn't pay attention |
|
Overall I'm still concerned about the amount of code here, especially when it resolves only a single false positive in the ecosystem report. Would prefer that this is simplified, and lean more on the existing features of the semantic model rather than reimplementing its own version of semantic model and control flow. Brent suggested that |
I will try to reduce the amount of code, but ecosystem analysis projects do not involve all Python code, so I think there is more than one case. |
|
I was mostly surprised that we need a new |
Hmm, I didn't even know about that. It seems more logical now. Thank you very much. |
amyreese
left a comment
There was a problem hiding this comment.
Looks much better to me, but I'm not quite familiar with the analysis bits.
|
Will wait for @ntBre to take a look. |
ntBre
left a comment
There was a problem hiding this comment.
Thank you! Yes, this makes sense to me. I'm slightly wary of deferring too many checks, especially since I made a similar suggestion twice in quick succession after not using it much before, but this does seem better than trying to restructure the rules to fit into deferred_scopes. I also don't see any negative perf effects of the deferral, which makes sense to me given how lightweight the snapshots are.
I think this is fine as-is, but if we defer any more types of statements, it probably makes sense to have a single deferred_statements with a match on the statement kind instead of adding more deferred_for_loops, deferred_with_statements type functions and fields.
Summary
Fix #21483
Test Plan