
pandas 3.x compatibility for .groups#12071

Merged
TomAugspurger merged 3 commits into dask:main from TomAugspurger:tom/pandas-4-warnings-fix
Sep 15, 2025


@TomAugspurger
Member

pandas 3.x deprecates accessing `.groupby(by=by).groups` when `by` is a length-1 list, so avoid passing length-1 lists.

Closes #12065

In pandas-dev/pandas@e191a06,
pandas is setting up a change in behavior when grouping by a length-1 list
and accessing `.groups`. This works around that warning by
not using length-1 lists.
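
A minimal sketch of the idea, not the actual change in this PR: unwrap a length-1 list into its single key before calling `groupby`, so `.groups` is never reached through the deprecated path. The helper name `unwrap_single_key` and the sample frame are hypothetical.

```python
import pandas as pd


def unwrap_single_key(by):
    """Return the lone element of a length-1 list; otherwise return `by` unchanged.

    Hypothetical helper for illustration only.
    """
    if isinstance(by, list) and len(by) == 1:
        return by[0]
    return by


df = pd.DataFrame({"a": [1, 1, 2], "b": [10, 20, 30]})

# Group by the scalar key "a" rather than the length-1 list ["a"].
groups = df.groupby(unwrap_single_key(["a"])).groups

# Lists with more than one key pass through untouched.
multi_key = unwrap_single_key(["a", "b"])
```

With a scalar key, the keys of `groups` are the scalar group labels, which is the behavior callers of `.groups` already rely on.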
@github-actions
Contributor

github-actions bot commented Sep 9, 2025

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

      9 files  ±0        9 suites  ±0   3h 5m 39s ⏱️ - 7m 31s
 18 059 tests ±0   16 842 ✅ ±0   1 217 💤 ±0  0 ❌ ±0 
161 605 runs  ±0  149 488 ✅ +1  12 117 💤  - 1  0 ❌ ±0 

Results for commit f9112c2. ± Comparison against base commit 37835c4.

♻️ This comment has been updated with latest results.

@TomAugspurger
Member Author

This fix appears to be OK, but there are a couple of new failures:

https://github.com/dask/dask/actions/runs/17585475622/job/49952187363?pr=12071#step:11:36267

FAILED dask/dataframe/tests/test_utils_dataframe.py::test_meta_nonempty - pandas.errors.Pandas4Warning: Constructing a Categorical with a dtype and values containing non-null entries not in that dtype's categories is deprecated and will raise in a future version.
FAILED dask/dataframe/tests/test_utils_dataframe.py::test_meta_nonempty_empty_categories - pandas.errors.Pandas4Warning: Constructing a Categorical with a dtype and values containing non-null entries not in that dtype's categories is deprecated and will raise in a future version.
= 2 failed, 16339 passed, 1013 skipped, 335 xfailed, 282 xpassed, 650 warnings in 1485.05s (0:24:45) =

I can address them here.

@TomAugspurger
Member Author

That latest warning came from something like

pd.Categorical([0, 0], categories=[])

I changed it to use pd.Categorical.from_codes. If we have categories, we know that [0, 0] are valid codes. If we don't have any categories, we can use [-1, -1].
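
A rough sketch of that approach, assuming a hypothetical helper named `placeholder_categorical`; it is not the code merged here, only an illustration of choosing codes that are always valid for the given categories.

```python
import pandas as pd


def placeholder_categorical(categories, ordered=False, n=2):
    """Build an n-element Categorical from codes that are valid for `categories`.

    Hypothetical helper for illustration only.
    """
    # Code 0 points at the first category when one exists;
    # code -1 marks a missing value and is valid even with no categories.
    code = 0 if len(categories) > 0 else -1
    return pd.Categorical.from_codes(
        [code] * n, categories=categories, ordered=ordered
    )


with_categories = placeholder_categorical(["x", "y"])
without_categories = placeholder_categorical([])
```

Because the codes are chosen from the categories themselves, no value outside the dtype's categories is ever passed to the constructor, which is what triggered the warning.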

@TomAugspurger TomAugspurger added the run-upstream Add this label to run the upstream-dev job on PRs in CI. label Sep 11, 2025
@TomAugspurger TomAugspurger merged commit 8468786 into dask:main Sep 15, 2025
25 of 26 checks passed
@TomAugspurger TomAugspurger deleted the tom/pandas-4-warnings-fix branch September 15, 2025 13:05

Development

Successfully merging this pull request may close these issues.

⚠️ Upstream CI failed ⚠️

2 participants