Stable sort in Series.value_counts for pandas 3.x #12191

Merged
TomAugspurger merged 2 commits into dask:main from TomAugspurger:tom/pandas-3-value-counts-stable on Dec 12, 2025

Conversation

@TomAugspurger
Member

pandas 3.x changed the behavior of `Series.value_counts` to use a stable sort. This PR changes our value counts aggregation, which uses `Series.sort_values`, to also use a stable sort, so that we match pandas when `sort=True`.
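
A minimal sketch of the tie-breaking behavior this relies on, using plain pandas rather than dask's actual aggregation code. The `counts` data is made up for illustration; only `Series.sort_values` and its `kind` parameter come from pandas.

```python
import pandas as pd

# Hypothetical combined counts: "a" and "c" tie at 3, "b" and "d" tie at 2.
# The index order stands in for the order in which the values arrived.
counts = pd.Series([2, 3, 2, 3, 1], index=["b", "a", "d", "c", "e"])

# kind="stable" keeps tied entries in their incoming order. The default
# kind="quicksort" makes no such guarantee, so tied rows may come out in
# a different order than pandas produces.
stable = counts.sort_values(ascending=False, kind="stable")

print(stable.index.tolist())  # tied labels stay in input order: a before c, b before d
```

With a stable sort the result depends only on the counts and the incoming order, which is what makes the output comparable with pandas when several values share the same count.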

xref #12178 (comment)

@TomAugspurger
Member Author

I wasn't able to reproduce this locally, so hopefully CI will tell us whether or not this is fixed.

@github-actions
Contributor

github-actions bot commented Dec 10, 2025

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

      9 files  +     2        9 suites  +2   3h 16m 35s ⏱️ + 45m 12s
 18 159 tests +     2   16 944 ✅ +     5   1 215 💤  -     3  0 ❌ ±0 
162 568 runs  +36 141  150 559 ✅ +33 499  12 009 💤 +2 642  0 ❌ ±0 

Results for commit 6e73a96. ± Comparison against base commit 2497ebe.

♻️ This comment has been updated with latest results.

@TomAugspurger
Member Author

Mmm I'm not too sure about the ubuntu-latest failures.

FAILED dask/dataframe/dask_expr/tests/test_collection.py::test_serialization - assert 2742 < (350 + 2316)
FAILED dask/dataframe/dask_expr/tests/test_collection.py::test_len - AssertionError: assert 100 == 10

I couldn't quickly reproduce them, but I haven't attempted to reproduce the environment exactly.

The good news is that the test passed on the pandas-nightly job: https://github.com/dask/dask/actions/runs/20104462644/job/57684176833?pr=12191#step:11:22183

@TomAugspurger
Member Author

TomAugspurger commented Dec 11, 2025

I used the same conda environment on a linux machine and was still unable to reproduce this...

If CI fails again I'm tempted to skip these tests on the failing platform.

@TomAugspurger
Member Author

Weird, it passed this time...

I plan to merge this sometime tomorrow.

@TomAugspurger TomAugspurger merged commit b2707dc into dask:main Dec 12, 2025
23 of 24 checks passed
@TomAugspurger TomAugspurger deleted the tom/pandas-3-value-counts-stable branch December 12, 2025 12:48