Skip to content

[airflow] Flag Variable.get() calls outside of task execution context (AIR003)#23584

Merged
ntBre merged 1 commit intoastral-sh:mainfrom
Dev-iL:airflow/variable
Mar 11, 2026
Merged

[airflow] Flag Variable.get() calls outside of task execution context (AIR003)#23584
ntBre merged 1 commit intoastral-sh:mainfrom
Dev-iL:airflow/variable

Conversation

@Dev-iL
Copy link
Copy Markdown
Contributor

@Dev-iL Dev-iL commented Feb 26, 2026

Summary

Implements a new rule AIR003 (airflow-variable-get-outside-task) that flags Variable.get() calls outside of task execution context.

Per the Airflow best practices documentation, calling Variable.get() at module level or inside operator constructor arguments causes a database query every time the DAG file is parsed by the scheduler. This can degrade parsing performance and even cause DAG file timeouts. The recommended alternative is to pass variables via Jinja templates ({{ var.value.my_var }}), which defer the lookup until task execution.

What the rule flags

from airflow.sdk import Variable
from airflow.operators.bash import BashOperator

# Top-level Variable.get() — runs on every DAG parse
foo = Variable.get("foo")

# Variable.get() in operator constructor args — also runs on every DAG parse
BashOperator(
    task_id="bad",
    bash_command="echo $FOO",
    env={"FOO": Variable.get("foo")},
)

# Variable.get() in a regular helper function — not a task execution context
def helper():
    return Variable.get("foo")

What it allows

Variable.get() is fine inside code that only runs during task execution:

  • @task-decorated functions (including @task(), @task.branch, @task.short_circuit, etc.)
  • execute() methods on BaseOperator subclasses

Implementation details

  • Resolves Variable.get via the semantic model, matching both airflow.models.Variable and airflow.sdk.Variable import paths.
  • Determines "task execution context" by walking the statement hierarchy (current_statements()) looking for a parent FunctionDef that is either:
    • decorated with @task / @task.<variant> (resolved via airflow.decorators.task), or
    • named execute inside a class inheriting from BaseOperator.
  • Handles both @task and @task() forms via map_callable, and attribute-style decorators like @task.branch via Expr::Attribute resolution.

Test Plan

Added snapshot tests in AIR005.py covering:

  • Violations: Variable.get() at module level, inside operator constructor keyword arguments, inside f-string interpolation in operator args, and in a regular helper function. Tested both airflow.models.Variable and airflow.sdk.Variable import paths.
  • Non-violations: Variable.get() inside @task, @task(), @task.branch decorated functions, inside a BaseOperator.execute() method, and Jinja template usage (no Variable.get() call).

related: apache/airflow#43176

@ntBre ntBre added rule Implementing or modifying a lint rule preview Related to preview mode features labels Feb 26, 2026
@astral-sh-bot
Copy link
Copy Markdown

astral-sh-bot bot commented Feb 26, 2026

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+8 -0 violations, +0 -0 fixes in 1 projects; 55 projects unchanged)

apache/airflow (+8 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --no-fix --output-format concise --preview --select ALL

+ airflow-core/tests/unit/dag_processing/test_processor.py:156:30: AIR003 `Variable.get()` outside of a task
+ airflow-core/tests/unit/dag_processing/test_processor.py:194:30: AIR003 `Variable.get()` outside of a task
+ airflow-core/tests/unit/dag_processing/test_processor.py:228:30: AIR003 `Variable.get()` outside of a task
+ airflow-core/tests/unit/dag_processing/test_processor.py:268:21: AIR003 `Variable.get()` outside of a task
+ providers/openlineage/tests/system/openlineage/example_openlineage_base_complex_dag.py:56:13: AIR003 `Variable.get()` outside of a task
+ providers/openlineage/tests/system/openlineage/example_openlineage_defer_simple_dag.py:39:14: AIR003 `Variable.get()` outside of a task
+ providers/openlineage/tests/system/openlineage/example_openlineage_mapped_simple_dag.py:46:18: AIR003 `Variable.get()` outside of a task
+ providers/openlineage/tests/system/openlineage/example_openlineage_short_circuit_dag.py:49:22: AIR003 `Variable.get()` outside of a task

Changes by rule (1 rules affected)

code total + violation - violation + fix - fix
AIR003 8 8 0 0 0

@Dev-iL
Copy link
Copy Markdown
Contributor Author

Dev-iL commented Feb 26, 2026

Hmm... Looking at the ecosystem results, it appears that this rule needs to be applied more surgically (only in DAG definition scripts). @sjyangkevin @Lee-W any idea how to do that?
Looks like I figured it out!

@jscheffl
Copy link
Copy Markdown

jscheffl commented Mar 2, 2026

Cool!

@Miretpl
Copy link
Copy Markdown

Miretpl commented Mar 3, 2026

Hi @Dev-iL, shouldn't we include pre/post_execute method subclasses of BaseOperator + possible callback defined for them 🤔?

Also, I'm not sure about the following violation example:

# Variable.get() in a regular (non-task) function
def helper():
    return Variable.get("foo")  # AIR005

as it depends on where the function will be used. If it will be passed as a callback, e.g. to PythonOperator (directly via python_callable instead of task decorator), it is good. On the other hand, if it is invoked somewhere in the code on the top level, it will be a violation.

@Dev-iL Dev-iL force-pushed the airflow/variable branch 2 times, most recently from c6295c6 to 8adc896 Compare March 4, 2026 06:58
Copy link
Copy Markdown
Contributor

@sjyangkevin sjyangkevin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for implementing it, some small feedback. I also wonder if we should move those utils to identify taskflow api decorated function to helpers, as it is used across multiple rules.

@Dev-iL Dev-iL force-pushed the airflow/variable branch from 8adc896 to 0bcf153 Compare March 4, 2026 18:34
@Dev-iL
Copy link
Copy Markdown
Contributor Author

Dev-iL commented Mar 4, 2026

Also, I'm not sure about the following violation example:

# Variable.get() in a regular (non-task) function
def helper():
    return Variable.get("foo")  # AIR005

as it depends on where the function will be used. If it will be passed as a callback, e.g. to PythonOperator (directly via python_callable instead of task decorator), it is good. On the other hand, if it is invoked somewhere in the code on the top level, it will be a violation.

Right. This is a limitation of static analysis - we cannot easily determine how a function is used. The current implementation errs on the side of caution, allowing some false positive which the user can suppress with # noqa: AIR005. I can add a note to the rule docs that it is expected to have some false positives.


we should move those utils to identify taskflow api decorated function to helpers, as it is used across multiple rules.

Indeed. This rule was implemented before the shared utilities, and it wasn't refactored after rebasing. I'll import from helpers.


can we make the message more generic and we put specific message when reporting diagnostic. For traditional operator we can suggest to use jinja but if it is a function can be taskflow api decorated we can suggest move in

Agreed. Will provide different suggestions depending on whether the violation is at module level (suggest Jinja) vs in a function (suggest @task decorator).


would be better to check on DAG definition line instead of import

Claude says this:

Important architectural constraint: Ruff uses a single-pass AST visitor. At the point where a Variable.get() call is analyzed, only bindings/statements before it in the file have been processed. This means:

  • Checking for DAG() calls that appear after the Variable.get() line won't work in a single pass
  • However, imports are always at the top, so checking if DAG/dag bindings exist (from imports) is reliable during traversal

DAGs can be defined in 3 ways:

  1. dag = DAG(...) — assignment
  2. with DAG(...) as dag: — context manager
  3. @dag — decorator on a function

Proposed approach: Replace is_dag_file() with a function that iterates global-scope bindings and checks their source statements (via binding.sourcesemantic.statement(node_id)) for actual DAG definitions:

  1. Assignment: source stmt is Assign/AnnAssign whose value is a DAG() call (resolve via semantic.resolve_qualified_name)
  2. Context manager: source stmt is With, check if any with_item.context_expr is a DAG() call
  3. Decorator: source stmt is FunctionDef, check if any decorator resolves to airflow.*.dag

Ordering constraint: Ruff is a single-pass linter. If Variable.get() appears at module level before the DAG definition, the DAG binding won't exist yet. This means we miss that violation. However:

  • In practice, DAG definitions typically appear before task definitions
  • The import-based check would catch this, so we keep imports as a fallback: if there's an import of DAG/dag from airflow, also treat it as a DAG file
  • This hybrid approach avoids false negatives while being more precise than import-only

Copy link
Copy Markdown
Contributor

@Lee-W Lee-W left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few nits, but overall looks good!

@Dev-iL Dev-iL force-pushed the airflow/variable branch from 0bcf153 to 93872da Compare March 5, 2026 08:06
Copy link
Copy Markdown
Contributor

@Lee-W Lee-W left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably let's change it to AIR003? This is likely to be approved first.

Copy link
Copy Markdown
Contributor

@ntBre ntBre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks good to me overall, just a few minor suggestions. I also agree with moving this to AIR003 if it's ready to land first, which looks likely!

@Dev-iL Dev-iL force-pushed the airflow/variable branch from 93872da to 383e6f9 Compare March 10, 2026 06:51
@Dev-iL Dev-iL changed the title [airflow] Implement airflow-variable-get-outside-task (AIR005) [airflow] Flag Variable.get() calls outside of task execution context (AIR003) Mar 10, 2026
@Dev-iL Dev-iL requested a review from ntBre March 10, 2026 06:55
Copy link
Copy Markdown
Contributor

@Lee-W Lee-W left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

Copy link
Copy Markdown
Contributor

@ntBre ntBre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thank you!

@ntBre ntBre merged commit 91cc7cd into astral-sh:main Mar 11, 2026
43 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

preview Related to preview mode features rule Implementing or modifying a lint rule

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants