[REVIEW] Use ignore_index for pandas' group_split_dispatch #6251
Merged
TomAugspurger merged 3 commits into dask:master on Jun 23, 2020
Member
I think this makes sense, but would want to check with someone else (cc @TomAugspurger) first before merging.
Member
TomAugspurger
left a comment
Looks good, one question.
Comment on lines +354 to +356

    if ignore_index:
        df2._meta = df2._meta.reset_index(drop=True)
    return df2
Member
Any reason to do this here rather than in rearrange_by_column_tasks?
Member
Author
Good question. It seemed to me that I would need to add this in two places in rearrange_by_column_tasks (since the _simple_rearrange_by_column_tasks code path could be used), but only once in rearrange_by_column.
Member
Author
Is this reasoning sufficient, or should I move the changes down into rearrange_by_column_tasks?
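For readers following the `_meta` line in the diff: a dask DataFrame carries an empty pandas object (`_meta`) that describes its columns, dtypes, and index. The sketch below uses plain pandas only, with a made-up frame and index name, to show why the metadata would need the same `reset_index(drop=True)` as the data once the partitions come back with a default index. It is an illustration of the idea, not the code in this PR.

```python
import pandas as pd

# Hypothetical example frame with a named, non-default index.
df = pd.DataFrame(
    {"x": [1, 2, 3]},
    index=pd.Index(["a", "b", "c"], name="key"),
)

# An empty slice stands in for dask's `_meta` here (assumption for illustration):
# same columns and dtypes, same kind of index, zero rows.
meta = df.iloc[:0]

# Resetting the index on the empty frame swaps the named index for a
# default RangeIndex while leaving columns and dtypes untouched.
meta_reset = meta.reset_index(drop=True)

print(type(meta.index).__name__, meta.index.name)
print(type(meta_reset.index).__name__, meta_reset.index.name)
```

If the data lost its original index but the metadata kept it, the two would disagree about the index, which is presumably what the reset in the diff avoids.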
TomAugspurger approved these changes on Jun 23, 2020
Member
Sorry for the delay here, thanks!
kumarprabhu1988 pushed a commit to kumarprabhu1988/dask that referenced this pull request on Oct 29, 2020
* ignore_index bugfix
After thinking about this comment from @jcrist in #6247, I decided to put some effort into making the pandas handling of ignore_index in group_split_dispatch more consistent with that of cudf (allowing us to test that the ignore_index argument is actually handled). With the changes in this PR, passing ignore_index=True will ensure that the original index will be replaced with a default RangeIndex.

If others feel this is not actually the behavior we want, I'll be happy to close :)
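To make the described behavior concrete, here is a minimal pandas-only sketch. The helper `group_split_sketch` and its signature are hypothetical and are not dask's actual `group_split_dispatch` implementation; it only shows what "replace the original index with a default RangeIndex" would mean for each split.

```python
import numpy as np
import pandas as pd


def group_split_sketch(df, c, k, ignore_index=False):
    """Split ``df`` into ``k`` parts using integer labels ``c``.

    Hypothetical helper for illustration only. With ``ignore_index=True``
    each part gets a fresh default RangeIndex instead of its original index.
    """
    parts = {}
    for i in range(k):
        part = df[c == i]  # boolean mask selects the rows labelled i
        if ignore_index:
            part = part.reset_index(drop=True)
        parts[i] = part
    return parts


df = pd.DataFrame({"x": [10, 20, 30, 40]}, index=["a", "b", "c", "d"])
labels = np.array([0, 1, 0, 1])

kept = group_split_sketch(df, labels, 2, ignore_index=False)
dropped = group_split_sketch(df, labels, 2, ignore_index=True)

print(kept[0].index)     # original labels for the rows in group 0
print(dropped[0].index)  # default RangeIndex for the same rows
```

The values in each split are identical in both cases; only the index differs.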
black dask / flake8 dask