Fix overlap so that set_index doesn't drop rows #9423

Merged
jsignell merged 3 commits into dask:main from jsignell:fix-overlap
Sep 15, 2022

Conversation

@jsignell
Member

@pavithraes pavithraes self-requested a review August 31, 2022 12:56
@pavithraes
Member

I've merged main to fix the test_parquet[fastparquet] failures :)

@pavithraes pavithraes requested a review from ian-r-rose August 31, 2022 16:55
@jsignell
Member Author

I am pretty confident of this change, so I will plan to merge this week unless there are comments.

Comment thread dask/dataframe/tests/test_shuffle.py Outdated
assert ddf2.npartitions == 8


def test_set_index_overlap_3():
Member

I see that we have test_set_index_overlap and then _2 and _3. I thought the only thing changing was the number of partitions but it seems it's not just that.

Would it be better to add a comment or docstring to each of these tests explaining what it is checking?

Member Author

Yeah, it's a lazy name. I did link out to the issue that has the full context, but I'm happy to change the name.

Comment thread dask/dataframe/tests/test_shuffle.py Outdated
@jsignell jsignell merged commit f45df2b into dask:main Sep 15, 2022
@jsignell jsignell deleted the fix-overlap branch September 15, 2022 20:28
Development

Successfully merging this pull request may close these issues.

ddf.set_index with sorted=True drops rows
Results of dask.multi.merge_asof depends on npartitions

3 participants