Skip to content

[Bug fix] Pass ignore_index to dd_shuffle from DataFrame.shuffle#6247

Merged
jcrist merged 2 commits intodask:masterfrom
rjzamora:ignore-index-bugfix
May 27, 2020
Merged

[Bug fix] Pass ignore_index to dd_shuffle from DataFrame.shuffle#6247
jcrist merged 2 commits intodask:masterfrom
rjzamora:ignore-index-bugfix

Conversation

@rjzamora
Copy link
Member

@rjzamora rjzamora commented May 27, 2020

Very trivial fix for (recently merged) #6066 - The ignore_index keyword is not being passed properly.

  • Tests added / passed
  • Passes black dask / flake8 dask

@rjzamora
Copy link
Member Author

Sorry for missing this in the original PR @TomAugspurger

@jcrist
Copy link
Member

jcrist commented May 27, 2020

LGTM. Is there a test that should be added that could have caught this bug?

@rjzamora
Copy link
Member Author

Is there a test that should be added that could have caught this bug?

The "bug" shows up in cudf/dask_cudf, because ignore_index=True is actually used in group_split_dispatch. However, the pandas implementation doesn't do anything with this argument, so there doesn't seem to be a simple dask.dataframe test.

@jcrist
Copy link
Member

jcrist commented May 27, 2020

Fine by me. Merging!

@jcrist jcrist merged commit 27d52b0 into dask:master May 27, 2020
@rjzamora
Copy link
Member Author

Thanks @jcrist !

Note that your test question made me think a bit more about the differences between pandas- and cudf-backed DataFrame behavior here. It may make sense to explicitly drop the index when using ignore_index=True. These changes probably give us what we want - Let me know if you think it's worth pushing another small PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants