Skip to content

PLW3301 (nested-min-max) Autofix inappropriately strips nested function call when the nested function call wasn't max #13088

@AustinStarnes

Description

@AustinStarnes

Keywords used searching for existing issues: "nested-min-max", "PLW3301"

Using Ruff VSCode extension v2024.42.0

Click to expand ruff.toml:
line-length = 88
indent-width = 4

[lint]
preview = true
select = ["ALL"]
ignore = [
    "ANN204", 
    "S110",
    "DOC201",
    "DOC501", 
    "D107", 
    "D105", 
    "SIM102", 
    "S404", 
    "PLW1514", 
    "ANN002", 
    "ANN003", 
    "PYI051", 
    "FIX002", 
    "ANN401", 
    "DOC402",
    "B008",
    "FAST002",
    "RET504"
]
Click to expand VSCode User settings.json:
 // ...
    "[python]": {
        "editor.formatOnSave": true,
        "editor.formatOnType": true,
        "editor.codeActionsOnSave": {
            "source.fixAll.ruff": "always",
        },
// ...
    "ruff.lint.args": [
        "--config",
        "/home/austin/.vscode/ruff.toml",
        "--ignore", "INP001,TRY003,EM101,EM102,BLE001,G004,N806,TD002,TD003",
        "--target-version", "py311",
        "--unfixable", "F401"
    ],
// ...
Click to expand the real context in which I encountered the bug, in case I am accused of reporting an issue that isn't encountered in real-world use cases: To get the width with which to print and right-justify a token produced by an LLM, I needed the max length of any decoded token string (plus two for bounding and characters that will be printed around the token).
But, in cases where none of the tokens generated were at least 3 (two less than the length of the column label "token"), this width with which to print and right-justify would be too narrow. So, I wanted the max of not just the token string length (plus two), but the max of the token string length (plus two) and the length of the label "token".
# ...
    token_string_label = "token"
    max_token_char_length = max(
        max(len(tokenizer.decode(tok)) + 2 for tok in generated_tokens[0]),
        len(token_string_label),
    )
    print(
        f"| {token_string_label.rjust(max_token_char_length)} "
        f"| {'token id':>8s} | {'log prob.':>9} | {'prob.':>6}",
    )
    for tok, score in zip(generated_tokens[0], transition_scores[0], strict=False):
        print(
            f"| {('▻' + tokenizer.decode(tok) + '◅').rjust(max_token_char_length)} "
            f"| {tok:8d} | {score.item():9.3f} | {torch.exp(score).item():.2%}",
        )
# ...

Minimal example:

sentence = "Hello world, ruff is cool!"

# The expression triggering rule PLW3301 (nested-min-max):
max_word_len = max(
    max(len(word) for word in sentence.split(" ")),
    len("Done!"),
)
# >>> max_word_len
# 6

# What the autofix expression produced:
max_word_len = max(*(len(word) for word in sentence.split(" ")), *"Done!")
# >>> max_word_len = max(*(len(word) for word in sentence.split(" ")), *"Done!")
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# TypeError: '>' not supported between instances of 'str' and 'int'

# If nested-min-max must be autofixed, I would imagine the result should be:
max_word_len = max(*(len(word) for word in sentence.split(" ")), len("Done!"))
# i.e., if an argument to the outer max() is not the result of a call to max(), then it
# should be left as is

I believe issue #5066 also demonstrated the bug, but the issue was closed because the reporter's usage of max wasn't valid either. The issue probably should not have been closed.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingfixesRelated to suggested fixes for violations

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions