Update value_counts to return correct name in pandas 2.0 #9919

Merged
jrbourbeau merged 3 commits into dask:main from j-bennet:j-bennet/9736-series-value-counts on Feb 6, 2023
Contributor

@j-bennet j-bennet commented Feb 4, 2023

Fix for an upstream failure:

2023-02-03T20:50:29.9817830Z FAILED dask/dataframe/tests/test_dataframe.py::test_value_counts_with_normalize_and_dropna - AssertionError: ('proportion', 'count')
2023-02-03T20:50:29.9817961Z assert 'proportion' == 'count'
2023-02-03T20:50:29.9818058Z   - count
2023-02-03T20:50:29.9818135Z   + proportion

In pandas 2.0, the Series returned by value_counts() is named "count" (or "proportion" when normalize=True) rather than inheriting the name of the original Series.

See https://pandas.pydata.org/docs/dev/whatsnew/v2.0.0.html#value-counts-sets-the-resulting-name-to-count.
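
For illustration, here is a minimal sketch of the pandas behavior behind this failure. It is not code from this PR: the helper `expected_value_counts_name` and the sample Series are hypothetical, and the version check assumes the major version is the first dot-separated component of `pd.__version__`.

```python
import pandas as pd


def expected_value_counts_name(original_name, normalize=False):
    """Hypothetical helper: name that Series.value_counts() gives its result."""
    major = int(pd.__version__.split(".")[0])
    if major >= 2:
        # pandas >= 2.0 always uses a fixed name for the result.
        return "proportion" if normalize else "count"
    # pandas < 2.0 keeps the name of the original Series.
    return original_name


s = pd.Series(["a", "b", "a", None], name="x")

counts = s.value_counts()
props = s.value_counts(normalize=True, dropna=False)

print(counts.name, props.name)
```

A test that hard-codes "count" as the expected name, as in the failure above, breaks as soon as `normalize=True` is passed, so the expected name has to depend on both the pandas version and `normalize`.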

Xref #9736.
Xref pandas-dev/pandas#49912.

  • Tests added / passed
  • Passes pre-commit run --all-files

Member

@jrbourbeau jrbourbeau left a comment

Thanks @j-bennet! Just one minor comment, but overall this looks good


@pytest.mark.skipif(not PANDAS_GT_110, reason="dropna implemented in pandas 1.1.0")
def test_value_counts_with_normalize_and_dropna():
@pytest.mark.parametrize("normalize", [True, False])
Member

Thanks for taking the time to expand test coverage here

Co-authored-by: James Bourbeau <jrbourbeau@users.noreply.github.com>
Member

@jrbourbeau jrbourbeau left a comment

This looks great, thanks @j-bennet

@jrbourbeau jrbourbeau changed the title from "In value_counts, resulting series name changed, pandas 2.0 compatibility" to "Update value_counts to return correct name in pandas 2.0" Feb 6, 2023
@jrbourbeau jrbourbeau merged commit c189ac5 into dask:main Feb 6, 2023
@j-bennet j-bennet deleted the j-bennet/9736-series-value-counts branch February 6, 2023 22:41