Skip to content

Enforce ruff/refurb rules (FURB)#12144

Merged
jacobtomlinson merged 6 commits intodask:mainfrom
DimitriPapadopoulos:FURB
Nov 19, 2025
Merged

Enforce ruff/refurb rules (FURB)#12144
jacobtomlinson merged 6 commits intodask:mainfrom
DimitriPapadopoulos:FURB

Conversation

@DimitriPapadopoulos
Copy link
Copy Markdown
Contributor

  • Closes #xxxx
  • Tests added / passed
  • Passes pre-commit run --all-files

@make_array_nonempty.register(DecimalDtype)
def _(dtype):
return DecimalArray._from_sequence([Decimal("0"), Decimal("NaN")], dtype=dtype)
return DecimalArray._from_sequence([Decimal(0), Decimal("NaN")], dtype=dtype)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about this rule. Perhaps consistency between integers and floats is more important.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Nov 8, 2025

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

      9 files  ±0        9 suites  ±0   3h 6m 20s ⏱️ - 10m 13s
 18 154 tests ±0   16 937 ✅  - 2   1 215 💤 ±0  2 ❌ +2 
162 614 runs  ±0  150 517 ✅  - 5  12 095 💤 +3  2 ❌ +2 

For more details on these failures, see this check.

Results for commit 7dc6abd. ± Comparison against base commit cd5e429.

♻️ This comment has been updated with latest results.

Copy link
Copy Markdown
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this rule is adding value. I would argue in some cases it is making code less readable. I think we should skip this one.

@DimitriPapadopoulos
Copy link
Copy Markdown
Contributor Author

DimitriPapadopoulos commented Nov 10, 2025

Agreed, now ignoring rule verbose-decimal-constructor (FURB157).

@jacobtomlinson
Copy link
Copy Markdown
Member

Sorry I mean this whole ruleset. I find many of the changes in this PR subjectively worse. We like simple code here and I don't think having extra linter rules that push us towards these patterns aligns with that.

@DimitriPapadopoulos
Copy link
Copy Markdown
Contributor Author

DimitriPapadopoulos commented Nov 10, 2025

While I agree SIM and FURB rules are the most debatable, I find the rules applied here result in much simpler code, in the spirit of https://matthewrocklin.com/write-dumb-code.html:

Not 100% certain about this one:

  • for-loop-writes (FURB122) (I do find the new code simple to understand, but agree that novice programmers might not be aware of writelines).

@jacobtomlinson
Copy link
Copy Markdown
Member

I think FURB116 and FURB122 were the ones that I felt made things less readable for novice developers.

@DimitriPapadopoulos
Copy link
Copy Markdown
Contributor Author

DimitriPapadopoulos commented Nov 11, 2025

  • As I see it, FURB116 is about using standard Python string formatting instead of reinventing the wheel. I'm not saying Python string formatting is easy to grasp for novice Python programmers, but it is standard Python and learning string formatting seems a necessity any way. Besides, the resulting code is much simpler. Finally, string formatting is used elsewhere in dask, for example:
    index = [f"row{i:03d}" for i in range(100)]
  • I wasn't certain about FURB122. On second thoughts, it's simply a suggestion to use a standard Python function, in the exact context this function has been designed to be used in. The purpose of the function is easy to understand because its name is simple English. Here the rule is applied to a test, but it could make a difference in I/O bound code (otherwise the function would probably not have been added to Python).

@DimitriPapadopoulos
Copy link
Copy Markdown
Contributor Author

DimitriPapadopoulos commented Nov 15, 2025

@jacobtomlinson I have further simplified the FURB116 fix, I think it's much cleaner now. I can ignore FURB122 if you really don't like this rule.

Copy link
Copy Markdown
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think using writelines makes the code any better or worse. There is no perf gain, no real readability gain and it has cost both our time to write and review. As I've said in other PRs I'm happy to merge it because you've already done it, but would caution that these changes aren't really worth the effort.

FURB116 Replace `hex` call with f-string
FURB122 Use of `f.write` in a for loop
FURB136 Replace `x if x > y else y` with `max(y, x)`
FURB163 Prefer `math.log2(N / group_size)` over `math.log` with a redundant base
FURB187 Use of assignment of `reversed` on list
@DimitriPapadopoulos
Copy link
Copy Markdown
Contributor Author

DimitriPapadopoulos commented Nov 17, 2025

My understanding is that writelines is about perf gain, as it involves less system calls. Otherwise, why would the function have been added to Python? See for example Writing to a File from the Python Cookbook:

Calling writelines is much faster than either joining the strings into one big string (e.g., with ''.join) and then calling write, or calling write repeatedly in a loop.

I must admit the documentation of FURB122 is not clear about the perf gain, but that's probably a documentation bug:

When writing a batch of elements, it's more idiomatic to use a single method call to IOBase.writelines, rather than write elements one by one.

@jacobtomlinson jacobtomlinson merged commit e867681 into dask:main Nov 19, 2025
23 of 25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants