Update numeric_only default in quantile for pandas 2.0 #9854
jrbourbeau merged 7 commits into dask:main
…calculate quantile.
jrbourbeau
left a comment
Thanks @j-bennet!
Is this part of the larger change in the default of numeric_only (xref #9471)? If so, while I agree this change gets this test passing, I think we'll also want to add numeric_only= support to quantile. Said another way, with pandas=2.0, dask's quantile and pandas' quantile don't have the same behavior.
If we want to mimic Pandas behavior, and have a different default depending on Pandas version, then yes, this needs more changes. I'll get on it.
jrbourbeau
left a comment
> quantile in Dask already supports numeric_only
Ah, great!
> If we want to mimic Pandas behavior, and have a different default depending on Pandas version, then yes, this needs more changes
👍
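For readers following the thread, the "different default depending on Pandas version" idea can be sketched roughly as below. This is only an illustration: `pandas_major` and `resolve_numeric_only` are hypothetical names, not Dask or pandas API, and the sketch assumes (as the PR description states) that quantile treated numeric_only as true before pandas 2.0 and no longer does from 2.0 on.

```python
from typing import Optional


def pandas_major(version: str) -> int:
    """Return the major component of a version string such as "2.0.1"."""
    return int(version.split(".")[0])


def resolve_numeric_only(numeric_only: Optional[bool], pandas_version: str) -> bool:
    """Pick the effective numeric_only value (hypothetical helper, not Dask API)."""
    if numeric_only is not None:
        # An explicit argument always wins over the version-dependent default.
        return numeric_only
    # Assumption: pandas < 2.0 behaved as numeric_only=True, pandas >= 2.0 as False.
    return pandas_major(pandas_version) < 2
```

With this sketch, `resolve_numeric_only(None, "1.5.3")` gives `True` and `resolve_numeric_only(None, "2.0.0")` gives `False`, while an explicit `True` or `False` is passed through unchanged.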
    assert result.name == 0.5
    tm.assert_index_equal(result.index, pd.Index(["A", "X", "B"]))
    assert (result == expected[0]).all()
    if numeric_only is False or (PANDAS_GT_200 and numeric_only is None):
I opted for an explicit fail branch to state exactly which errors I'm expecting (TypeError in Pandas, but NotImplementedError in Dask), but this could be an xfail instead. What do you think, @jrbourbeau?
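To make the "explicit fail branch" idea concrete, here is a small standalone sketch. It does not use pandas, Dask, or pytest: `toy_quantile` is a hypothetical stand-in for a DataFrame median, and `check_quantile` is an illustrative name. Only the branching condition mirrors the test line quoted above; in the real test the error branch would more likely be written with `pytest.raises`.

```python
import statistics


def toy_quantile(columns, numeric_only):
    """Hypothetical stand-in for a DataFrame median; not pandas or Dask code."""
    result = {}
    for name, values in columns.items():
        is_numeric = all(isinstance(v, (int, float)) for v in values)
        if not is_numeric:
            if numeric_only:
                continue  # silently drop non-numeric columns
            raise TypeError(f"column {name!r} is not numeric")
        result[name] = statistics.median(values)
    return result


def check_quantile(columns, numeric_only, pandas_gt_200):
    """Run the explicit-branch pattern: expect an error or check the result."""
    # Same condition as the test: numeric_only=None means "use the library default".
    if numeric_only is False or (pandas_gt_200 and numeric_only is None):
        try:
            toy_quantile(columns, numeric_only=False)
        except (TypeError, NotImplementedError):
            return "raised"  # the error we stated we expect
        raise AssertionError("expected an error for non-numeric columns")
    # Otherwise non-numeric columns are dropped and a result comes back.
    return toy_quantile(columns, numeric_only=True)
```

The trade-off discussed in the comment is visible here: the explicit branch names the expected exception types, whereas an xfail marker would only record that the case is known to fail.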
@jrbourbeau Please take another look at this one when you have the chance.
…6-upstream-fix-quantile
jrbourbeau
left a comment
Thanks @j-bennet! This should be good to go -- will merge after CI finishes
Fix for upstream failure: dask/dataframe/tests/test_dataframe.py::test_dataframe_quantile.
Previously, numeric_only was true by default in quantile, but now we need to set it explicitly.
Related: #9736.
pre-commit run --all-files