[airflow] Implement airflow-task-implicit-multiple-outputs (AIR202)#25152
Conversation
|
| code | total | + violation | - violation | + fix | - fix |
|---|---|---|---|---|---|
| AIR202 | 19 | 19 | 0 | 0 | 0 |
Flags `@task`-decorated functions whose `multiple_outputs` behavior is determined by Airflow's runtime annotation-based inference rather than being set explicitly. Triggers when the decorator omits `multiple_outputs=` and either the return annotation resolves to a `collections.abc.Mapping` subclass or the body returns a dict literal, dict comprehension, or `dict(...)` call. Provides an unsafe autofix that inserts `multiple_outputs=True` (annotation path) or `=False` (body-only path), mirroring Airflow's `_infer_multiple_outputs` logic. Covers `@task` and supported variants (`python`, `virtualenv`, `external_python`, `branch`, `short_circuit`, `docker`, `kubernetes`, `pyspark`, …). Excludes `@task.sensor`, which hardcodes `multiple_outputs=False`. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds two negative test cases that confirm dict returns inside nested function definitions and inner class methods are not attributed to the outer `@task`-decorated function, locking the `ReturnStatementVisitor` nesting behavior against future regressions. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds three new positive cases: - @task(retries=3): call form with an existing kwarg; fix inserts multiple_outputs=True alongside it via add_argument - multiline @task(\n retries=3,\n): fix preserves indentation and inserts kwarg on the existing kwarg line - @task.short_circuit(trigger_rule="all_done"): variant decorator with call form; fix inserts multiple_outputs=True first Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Renames struct, file, and all wiring from AirflowTaskMultipleOutputsImplicit / task_multiple_outputs_implicit to AirflowTaskImplicitMultipleOutputs / task_implicit_multiple_outputs. "implicit X" puts the qualifier next to the noun it qualifies, matching conventional English naming. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Rename violation field `inferred` → `annotation_is_mapping` for clarity - Inline `SUPPORTED_VARIANTS` allowlist as a `matches!` arm at its sole use site - Short-circuit the function-body traversal when the annotation already proves the return type is a `Mapping` subclass - Detect locally-defined `TypedDict` subclasses (e.g. `class MyTD(TypedDict)`) via `analyze::class::any_qualified_base_class`, so `-> MyTD` annotations now correctly fix to `multiple_outputs=True` Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
395d15f to
0ab23ac
Compare
|
Hey, a bit of a lack of bandwidth these days. The idea definitely looks good to me @Dev-iL, and I discussed it already. let me take a look at the detail as well |
Lee-W
left a comment
There was a problem hiding this comment.
LGTM. One nitpick question
| @task | ||
| def nested_function_returning_dict(): # should NOT flag: dict returned by inner fn, not outer | ||
| def inner(): | ||
| return {"x": 1} | ||
|
|
||
| return inner() |
There was a problem hiding this comment.
@task
def nested_function_returning_dict():
def inner() -> dict:
return {"x": 1}
return inner()What will happen in Airflow if we do this?
There was a problem hiding this comment.
just verified. it's pretty much the same as
@task
def nested_function_returning_dict():
return {"x": 1}
There was a problem hiding this comment.
I think this case was not included since it opens a Pandora's box of recursion. I will consult my design docs in a few h and see what the exact reason behind this decision was.
There was a problem hiding this comment.
I doubt it has a good reason. And I feel we should not handle it either. but yep, exploring it is never a bad thing
There was a problem hiding this comment.
just verified. it's pretty much the same as
@task def nested_function_returning_dict(): return {"x": 1}
Right, I think we don't flag this because we can't tell what the user had in mind. Currently, Airflow doesn't care about the annotation of the inner function. If it did, the inferred value of multiple_outputs would be True. But is that what the user wants? Since the current implementation matches the behavior of Airflow, there's no inference surprise to warn the user about.
I suggest we can recommend users to enable ANN201 on their dags folder, and then this rule will be easily applicable.
Reason 1 - we cannot trust the annotation of the the inner functions
The example contained in the code is likely oversimplified. It is more obvious when we don't return the output of inner directly:
@task
def a():
def inner() -> dict: ...
return {**inner(), "extra": 1} # merged dict: still a Mapping, but no annotation on the merge
@task
def b():
def left() -> dict: ...
def right() -> list: ...
return left() if cond else right() # conditional: one branch dict, one not
@task
def c():
def inner() -> dict: ...
result = inner()
result["computed"] = expensive()
return result # local mutation through a variable
@task
def d():
from .helpers import build_payload # cross-module: annotation lives elsewhere
return build_payload()Handling the above properly is a type-checker's job, not a lint heuristic.
Reason 2 - ReturnStatementVisitor (intentionally) does not descend into nested scopes.
The body-path heuristic walks only the outer function's own return statements; return inner() is an Expr::Call, not a dict literal / dict comprehension / dict(...). There's nothing at the AST level that says the outer task returns a Mapping.
Bottom line - if we want to cover this case in the future, the right place is a type-checker-backed variant (e.g. running under ty).
There was a problem hiding this comment.
I don't really want to cover it... but yep, if we were to do that, ty would be a better chioce
ntBre
left a comment
There was a problem hiding this comment.
Looks good to me, thank you!
airflow] Implement airflow-task-implicit-multiple-outputs (AIR202)airflow] Implement airflow-task-implicit-multiple-outputs (AIR202)
…02`) (astral-sh#25152) > [!IMPORTANT] > Disclosure: this PR description was AI-drafted under my direction and reviewed by me before posting. ## Summary Implements rule `AIR202` (`airflow-task-implicit-multiple-outputs`) that flags `@task`-decorated functions whose `multiple_outputs` behavior is determined by Airflow's runtime inference rather than being set explicitly on the decorator. At runtime, Airflow's [`_infer_multiple_outputs`](https://github.com/apache/airflow/blob/main/task-sdk/src/airflow/sdk/bases/decorator.py) inspects the function's return annotation: if it resolves to a subclass of `collections.abc.Mapping`, the return value is split into one `XCom` per key; otherwise the entire return value is stored as a single `XCom`. This couples typing to `XCom` layout in a non-obvious way — renaming, removing, or refining the return annotation silently changes a DAG's `XCom` behavior. Passing `multiple_outputs=` explicitly makes the author's intent clear and insulates the DAG from future changes to inference. ### What the rule flags ```python from airflow.sdk import task # or equivalent Airflow 2.x import path # AIR202 — annotation in the Mapping family @task def extract() -> dict: return {"x": 1, "y": 2} # AIR202 — body returns a dict literal @task() def transform(): return {"a": 1} # AIR202 — call-form decorator with unrelated kwargs @task(retries=3) def with_retries() -> dict[str, int]: return {"n": 1} # AIR202 — variant decorators with parens @task.short_circuit(trigger_rule="all_done") def gate() -> dict: return {"go": True} ``` ### Suggested fix The fix is **always available** but marked **unsafe**: it inserts the value Airflow would have inferred, preserving runtime behavior at the moment of the rewrite, but the author may have intended a different `XCom` layout and a multi-return function may not always return a dict. ```python @task(multiple_outputs=True) def extract() -> dict: return {"x": 1, "y": 2} @task(multiple_outputs=True) def transform(): return {"a": 1} @task(retries=3, multiple_outputs=True) def with_retries() -> dict[str, int]: return {"n": 1} ``` For the call-form path (`@task(...)` with existing parentheses), the fix uses `add_argument` so existing kwargs and formatting are preserved, including on multiline decorators. For the bare form (`@task`, `@task.branch`), the fix appends `(multiple_outputs=...)` after the decorator expression. ### Detection signals The rule fires when **either** of two independent signals holds and `multiple_outputs` is absent from the decorator: 1. **Annotation signal** — the return annotation resolves to a member of the Mapping family: `dict`, `Dict`, `Mapping`, `MutableMapping`, `OrderedDict`, `DefaultDict`, `Counter`, `ChainMap`, `TypedDict` (and the `collections` / `typing` variants thereof). 2. **Body signal** — the function body contains at least one `return` statement whose value is a dict literal, a dict comprehension, or a `dict(...)` call. `ReturnStatementVisitor` is used so returns inside nested `if` / `for` / `while` / `try` blocks are considered, while returns inside nested function definitions or class bodies are correctly **not** considered. ### What it does NOT flag - Functions where `multiple_outputs=` is already set explicitly (either `True` or `False`). - `@task.sensor`-decorated functions — the sensor decorator hardcodes `multiple_outputs=False`. - Functions whose only dict-returning code lives in a nested function or class method (the outer task is not affected). - Functions whose return annotation is not in the Mapping family and whose body does not return a dict-shaped value. ### Implementation details - Resolves `@task` / `@task.<variant>` via the semantic model, handling both bare (`@task`) and call (`@task(...)`) forms via `map_callable`. The supported variants list is local to the rule: `python`, `virtualenv`, `external_python`, `branch`, `branch_virtualenv`, `branch_external_python`, `short_circuit`, `docker`, `kubernetes`, `pyspark`. - Annotation matching walks subscripts (e.g., `dict[str, int]`, `Mapping[str, Any]`) down to the head and uses `match_typing_expr` / known-module resolution. - Body matching uses `ReturnStatementVisitor` from `ruff_python_ast::helpers`, which already skips nested function and class definitions — so nested-scope dict returns are correctly ignored without a custom traversal. - The inserted value (`True` / `False`) mirrors Airflow's inference: `True` when the annotation is in the Mapping family, `False` otherwise. When both signals fire but the annotation is non-Mapping (e.g., body returns a dict but the annotation is `list`), the inserted value is still `False` to preserve runtime behavior. ## Test Plan Added snapshot tests in `AIR202.py` covering: | Group | Cases | |-------|-------| | Annotation positives | `dict`, `dict[str, int]`, `Dict`, `Mapping`, `MutableMapping`, `OrderedDict`, `DefaultDict`, `Counter`, `ChainMap`, `TypedDict` | | Body positives | dict literal, dict comprehension, `dict(...)` call | | Decorator forms | bare `@task`, `@task()`, `@task(retries=3)`, multiline `@task(...)`, `@task.docker(image=...)`, `@task.virtualenv()`, `@task.short_circuit(trigger_rule=...)`, `@task.branch` | | Explicit-kwarg negatives | `multiple_outputs=True`, `multiple_outputs=False` — never flagged | | Sensor negative | `@task.sensor` returning a dict — never flagged | | Nested-scope negatives | outer task with inner function returning a dict; outer task with inner class whose method returns a dict — never flagged | A sweep against the local Apache Airflow checkout (`cargo run -p ruff -- check ~/airflow --no-cache --preview --select AIR202`) produced 19 diagnostics on the Airflow codebase itself, all of which were verified to be genuine cases of implicit `multiple_outputs` inference (no false positives). Related: - apache/airflow#43176 - apache/airflow#66712 Community approval: https://lists.apache.org/thread/99s321s2p5xqhzlm0x5wjgy044nggtpm --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…02`) (astral-sh#25152) > [!IMPORTANT] > Disclosure: this PR description was AI-drafted under my direction and reviewed by me before posting. ## Summary Implements rule `AIR202` (`airflow-task-implicit-multiple-outputs`) that flags `@task`-decorated functions whose `multiple_outputs` behavior is determined by Airflow's runtime inference rather than being set explicitly on the decorator. At runtime, Airflow's [`_infer_multiple_outputs`](https://github.com/apache/airflow/blob/main/task-sdk/src/airflow/sdk/bases/decorator.py) inspects the function's return annotation: if it resolves to a subclass of `collections.abc.Mapping`, the return value is split into one `XCom` per key; otherwise the entire return value is stored as a single `XCom`. This couples typing to `XCom` layout in a non-obvious way — renaming, removing, or refining the return annotation silently changes a DAG's `XCom` behavior. Passing `multiple_outputs=` explicitly makes the author's intent clear and insulates the DAG from future changes to inference. ### What the rule flags ```python from airflow.sdk import task # or equivalent Airflow 2.x import path # AIR202 — annotation in the Mapping family @task def extract() -> dict: return {"x": 1, "y": 2} # AIR202 — body returns a dict literal @task() def transform(): return {"a": 1} # AIR202 — call-form decorator with unrelated kwargs @task(retries=3) def with_retries() -> dict[str, int]: return {"n": 1} # AIR202 — variant decorators with parens @task.short_circuit(trigger_rule="all_done") def gate() -> dict: return {"go": True} ``` ### Suggested fix The fix is **always available** but marked **unsafe**: it inserts the value Airflow would have inferred, preserving runtime behavior at the moment of the rewrite, but the author may have intended a different `XCom` layout and a multi-return function may not always return a dict. ```python @task(multiple_outputs=True) def extract() -> dict: return {"x": 1, "y": 2} @task(multiple_outputs=True) def transform(): return {"a": 1} @task(retries=3, multiple_outputs=True) def with_retries() -> dict[str, int]: return {"n": 1} ``` For the call-form path (`@task(...)` with existing parentheses), the fix uses `add_argument` so existing kwargs and formatting are preserved, including on multiline decorators. For the bare form (`@task`, `@task.branch`), the fix appends `(multiple_outputs=...)` after the decorator expression. ### Detection signals The rule fires when **either** of two independent signals holds and `multiple_outputs` is absent from the decorator: 1. **Annotation signal** — the return annotation resolves to a member of the Mapping family: `dict`, `Dict`, `Mapping`, `MutableMapping`, `OrderedDict`, `DefaultDict`, `Counter`, `ChainMap`, `TypedDict` (and the `collections` / `typing` variants thereof). 2. **Body signal** — the function body contains at least one `return` statement whose value is a dict literal, a dict comprehension, or a `dict(...)` call. `ReturnStatementVisitor` is used so returns inside nested `if` / `for` / `while` / `try` blocks are considered, while returns inside nested function definitions or class bodies are correctly **not** considered. ### What it does NOT flag - Functions where `multiple_outputs=` is already set explicitly (either `True` or `False`). - `@task.sensor`-decorated functions — the sensor decorator hardcodes `multiple_outputs=False`. - Functions whose only dict-returning code lives in a nested function or class method (the outer task is not affected). - Functions whose return annotation is not in the Mapping family and whose body does not return a dict-shaped value. ### Implementation details - Resolves `@task` / `@task.<variant>` via the semantic model, handling both bare (`@task`) and call (`@task(...)`) forms via `map_callable`. The supported variants list is local to the rule: `python`, `virtualenv`, `external_python`, `branch`, `branch_virtualenv`, `branch_external_python`, `short_circuit`, `docker`, `kubernetes`, `pyspark`. - Annotation matching walks subscripts (e.g., `dict[str, int]`, `Mapping[str, Any]`) down to the head and uses `match_typing_expr` / known-module resolution. - Body matching uses `ReturnStatementVisitor` from `ruff_python_ast::helpers`, which already skips nested function and class definitions — so nested-scope dict returns are correctly ignored without a custom traversal. - The inserted value (`True` / `False`) mirrors Airflow's inference: `True` when the annotation is in the Mapping family, `False` otherwise. When both signals fire but the annotation is non-Mapping (e.g., body returns a dict but the annotation is `list`), the inserted value is still `False` to preserve runtime behavior. ## Test Plan Added snapshot tests in `AIR202.py` covering: | Group | Cases | |-------|-------| | Annotation positives | `dict`, `dict[str, int]`, `Dict`, `Mapping`, `MutableMapping`, `OrderedDict`, `DefaultDict`, `Counter`, `ChainMap`, `TypedDict` | | Body positives | dict literal, dict comprehension, `dict(...)` call | | Decorator forms | bare `@task`, `@task()`, `@task(retries=3)`, multiline `@task(...)`, `@task.docker(image=...)`, `@task.virtualenv()`, `@task.short_circuit(trigger_rule=...)`, `@task.branch` | | Explicit-kwarg negatives | `multiple_outputs=True`, `multiple_outputs=False` — never flagged | | Sensor negative | `@task.sensor` returning a dict — never flagged | | Nested-scope negatives | outer task with inner function returning a dict; outer task with inner class whose method returns a dict — never flagged | A sweep against the local Apache Airflow checkout (`cargo run -p ruff -- check ~/airflow --no-cache --preview --select AIR202`) produced 19 diagnostics on the Airflow codebase itself, all of which were verified to be genuine cases of implicit `multiple_outputs` inference (no false positives). Related: - apache/airflow#43176 - apache/airflow#66712 Community approval: https://lists.apache.org/thread/99s321s2p5xqhzlm0x5wjgy044nggtpm --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Important
Disclosure: this PR description was AI-drafted under my direction and reviewed by me before posting.
Summary
Implements rule
AIR202(airflow-task-implicit-multiple-outputs) that flags@task-decorated functions whosemultiple_outputsbehavior is determined by Airflow's runtime inference rather than being set explicitly on the decorator.At runtime, Airflow's
_infer_multiple_outputsinspects the function's return annotation: if it resolves to a subclass ofcollections.abc.Mapping, the return value is split into oneXComper key; otherwise the entire return value is stored as a singleXCom. This couples typing toXComlayout in a non-obvious way — renaming, removing, or refining the return annotation silently changes a DAG'sXCombehavior. Passingmultiple_outputs=explicitly makes the author's intent clear and insulates the DAG from future changes to inference.What the rule flags
Suggested fix
The fix is always available but marked unsafe: it inserts the value Airflow would have inferred, preserving runtime behavior at the moment of the rewrite, but the author may have intended a different
XComlayout and a multi-return function may not always return a dict.For the call-form path (
@task(...)with existing parentheses), the fix usesadd_argumentso existing kwargs and formatting are preserved, including on multiline decorators. For the bare form (@task,@task.branch), the fix appends(multiple_outputs=...)after the decorator expression.Detection signals
The rule fires when either of two independent signals holds and
multiple_outputsis absent from the decorator:dict,Dict,Mapping,MutableMapping,OrderedDict,DefaultDict,Counter,ChainMap,TypedDict(and thecollections/typingvariants thereof).returnstatement whose value is a dict literal, a dict comprehension, or adict(...)call.ReturnStatementVisitoris used so returns inside nestedif/for/while/tryblocks are considered, while returns inside nested function definitions or class bodies are correctly not considered.What it does NOT flag
multiple_outputs=is already set explicitly (eitherTrueorFalse).@task.sensor-decorated functions — the sensor decorator hardcodesmultiple_outputs=False.Implementation details
@task/@task.<variant>via the semantic model, handling both bare (@task) and call (@task(...)) forms viamap_callable. The supported variants list is local to the rule:python,virtualenv,external_python,branch,branch_virtualenv,branch_external_python,short_circuit,docker,kubernetes,pyspark.dict[str, int],Mapping[str, Any]) down to the head and usesmatch_typing_expr/ known-module resolution.ReturnStatementVisitorfromruff_python_ast::helpers, which already skips nested function and class definitions — so nested-scope dict returns are correctly ignored without a custom traversal.True/False) mirrors Airflow's inference:Truewhen the annotation is in the Mapping family,Falseotherwise. When both signals fire but the annotation is non-Mapping (e.g., body returns a dict but the annotation islist), the inserted value is stillFalseto preserve runtime behavior.Test Plan
Added snapshot tests in
AIR202.pycovering:dict,dict[str, int],Dict,Mapping,MutableMapping,OrderedDict,DefaultDict,Counter,ChainMap,TypedDictdict(...)call@task,@task(),@task(retries=3), multiline@task(...),@task.docker(image=...),@task.virtualenv(),@task.short_circuit(trigger_rule=...),@task.branchmultiple_outputs=True,multiple_outputs=False— never flagged@task.sensorreturning a dict — never flaggedA sweep against the local Apache Airflow checkout (
cargo run -p ruff -- check ~/airflow --no-cache --preview --select AIR202) produced 19 diagnostics on the Airflow codebase itself, all of which were verified to be genuine cases of implicitmultiple_outputsinference (no false positives).Related:
Community approval: https://lists.apache.org/thread/99s321s2p5xqhzlm0x5wjgy044nggtpm