
Propagate contextvars to worker threads; catch warnings in 3.14t#12224

Merged
crusaderky merged 1 commit into dask:main from crusaderky:contextvars
Feb 4, 2026

Conversation

@crusaderky
Collaborator

@crusaderky crusaderky commented Jan 6, 2026

Context

Python 3.14t introduces two substantial changes that impact Dask:

  • contextvars are now propagated from the parent thread to newly spawned threads. With thread pools, however, a worker thread may already be running when the user calls submit, and so it won't pick up later changes to the caller's context.
  • warnings.catch_warnings is now backed by contextvars, whereas previously it was process-global. This also affects @pytest.mark.filterwarnings.

This change in CPython was necessary to allow catching warnings in threads without interfering with other threads.
It's worth noting that, for Dask, this behaviour is also relevant in GIL-enabled builds (think of divisions by zero in numpy). On Python 3.14t, this behaviour is on by default. On 3.14, it is opt-in with

python -X thread_inherit_context=1 -X context_aware_warnings=1

or with equivalent environment variables.
See the upstream documentation.

Impact

The direct consequence for most Dask users is that catching warnings on the threaded scheduler becomes erratic on 3.14t, because dask.threaded keeps thread pools warm across calls to get/compute/persist.

Consider this:

import warnings
import dask.array as da

a = da.zeros(1)
a.compute()

b = da.zeros(1) / da.zeros(1)
with warnings.catch_warnings(category=RuntimeWarning, action='ignore'):
    b.compute()

The above fails to suppress the warning on 3.14t, because the pool thread was already spawned by a.compute() before catch_warnings was entered. If you comment out a.compute(), it works again.

The same happens here:

import dask.array as da
import pytest


def test_1():
    a = da.zeros(1)
    a.compute()


@pytest.mark.filterwarnings("ignore:invalid value encountered in divide:RuntimeWarning")
def test_2():
    b = da.zeros(1) / da.zeros(1)
    b.compute()

pytest will fail test_2 with a RuntimeWarning; pytest -k test_2 on its own will pass.
In-depth discussion here: pytest-dev/pytest#14077

This PR

  • scheduler="threads" (which is the default for dask.array and dask.dataframe when there is no distributed.Client) will now capture the contextvars on the thread calling get/compute/persist and replicate them in the worker threads. This happens on all Python versions and regardless of thread pool warmup.
  • Fix regression on Python 3.14t where warnings.catch_warnings and @pytest.mark.filterwarnings would not catch anything if the thread pool was already warm (e.g. if another test already called get/compute/persist).



@pytest.mark.parametrize("num_workers", [None, 1])
def test_context_aware_warnings(num_workers):
Collaborator Author


3.14t CI run, where this test becomes meaningful, in #12223

@github-actions
Contributor

github-actions bot commented Jan 6, 2026

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

0 tests  ±0   0 ✅ ±0   0s ⏱️ ±0s
0 suites ±0   0 💤 ±0 
0 files   ±0   0 ❌ ±0 

Results for commit f5db9d5. ± Comparison against base commit 825904a.

♻️ This comment has been updated with latest results.

@crusaderky crusaderky marked this pull request as ready for review January 7, 2026 10:38
@crusaderky
Collaborator Author

CI failures are unrelated

@crusaderky
Collaborator Author

@jacobtomlinson @jsignell does either of you have bandwidth to offer review?

Member

@jsignell jsignell left a comment


I can't say I 100% understand all the pieces of this, but I read it through and tried it out. It works well and seems like a good improvement. Does it have any negative repercussions for Python < 3.14t?

@crusaderky
Collaborator Author

crusaderky commented Feb 3, 2026

Does it have any negative repercussions for Python<3.14t?

On 3.13 I'm observing an extra 5-10 μs/task of overhead:

>>> import dask.array as da
>>> a = da.ones(5000, chunks=1).sum(split_every=2)
>>> len(a.dask)
15005
>>> _ = a.compute()  # thread pool warm-up
>>> %time a.compute()

# BEFORE
CPU times: user 1.87 s, sys: 244 ms, total: 2.11 s
Wall time: 1.85 s

# AFTER
CPU times: user 1.97 s, sys: 215 ms, total: 2.19 s
Wall time: 1.96 s

IMHO it's a reasonable price to pay to have consistent behaviour across all Python versions.

@jsignell are you happy to merge?

@crusaderky
Collaborator Author

I'm also observing no performance degradation if I define 1000 contextvars globally:

>>> import contextvars
>>> vars = [contextvars.ContextVar(f"c{i}") for i in range(1000)]
>>> for var in vars:
...     var.set(123)

@jsignell
Member

jsignell commented Feb 3, 2026

Yeah I am fine with merging.

@crusaderky crusaderky merged commit 08f5daa into dask:main Feb 4, 2026
46 of 47 checks passed
@crusaderky crusaderky deleted the contextvars branch February 4, 2026 14:26