Skip to content

Fixes related to group_keys default change in pandas 2.0#9855

Merged
jrbourbeau merged 4 commits intodask:mainfrom
j-bennet:j-bennet/9736-upstream-fix-groupby
Jan 24, 2023
Merged

Fixes related to group_keys default change in pandas 2.0#9855
jrbourbeau merged 4 commits intodask:mainfrom
j-bennet:j-bennet/9736-upstream-fix-groupby

Conversation

@j-bennet
Copy link
Copy Markdown
Contributor

@j-bennet j-bennet commented Jan 19, 2023

Related: #9736.

In pandas 2.0, groupby adds group keys to index.

  • Tests added / passed
  • Passes pre-commit run --all-files

@j-bennet j-bennet requested a review from jrbourbeau January 19, 2023 22:19
@j-bennet j-bennet marked this pull request as draft January 20, 2023 01:57
@j-bennet j-bennet changed the title Fix test_full_groupby, pandas 2.0 compatibility [WIP]Fix test_full_groupby, pandas 2.0 compatibility Jan 20, 2023
@j-bennet j-bennet changed the title [WIP]Fix test_full_groupby, pandas 2.0 compatibility [WIP] Fix test_full_groupby, pandas 2.0 compatibility Jan 20, 2023
@j-bennet j-bennet changed the title [WIP] Fix test_full_groupby, pandas 2.0 compatibility Fix test_full_groupby, pandas 2.0 compatibility Jan 23, 2023
@j-bennet j-bennet changed the title Fix test_full_groupby, pandas 2.0 compatibility Fixes related to group_keys default in groupby, pandas 2.0 compatibility Jan 23, 2023
@jrbourbeau jrbourbeau changed the title Fixes related to group_keys default in groupby, pandas 2.0 compatibility Fixes related to group_keys default change in pandas 2.0 Jan 23, 2023
Copy link
Copy Markdown
Member

@jrbourbeau jrbourbeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @j-bennet! Overall this looks good to me. Just one small suggestion.

Also, I see this is marked as a draft PR. Are there other things you'd still like to include here?

Comment thread dask/dataframe/core.py Outdated
no_default = "__no_default__"

GROUP_KEYS_DEFAULT = None if PANDAS_GT_150 else True
GROUP_KEYS_DEFAULT = None
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this line may not be needed

Suggested change
GROUP_KEYS_DEFAULT = None

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially this is how I had it, but I get a complaint from mypy:

mypy.....................................................................Failed
- hook id: mypy
- exit code: 1

dask/dataframe/core.py:104: error: Incompatible types in assignment (expression has type "None", variable has type "bool")  [assignment]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. Thanks for clarifying. I went ahead and pushed a tiny commit (53b2d77) that removes the unused assignment, but still makes mypy happy (hope you don't mind). That should, hopefully, make this easier for future readers to reason about.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like your version, it's shorter too. I don't see type hints used much in our codebase, but since we support Python 3.8+, we could be using them more, right? What's the general agreement on using them? More is good, less is good?

@j-bennet j-bennet marked this pull request as ready for review January 23, 2023 23:38
@j-bennet
Copy link
Copy Markdown
Contributor Author

Also, I see this is marked as a draft PR. Are there other things you'd still like to include here?

No, not a draft anymore.

Copy link
Copy Markdown
Member

@jrbourbeau jrbourbeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @j-bennet

@jrbourbeau jrbourbeau merged commit 391d621 into dask:main Jan 24, 2023
@j-bennet j-bennet deleted the j-bennet/9736-upstream-fix-groupby branch January 25, 2023 20:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants