Fix overlapping divisions error on append #8997

Merged
jcrist merged 4 commits into dask:main from ian-r-rose:fix-overlapping-divisions-error-on-append
May 5, 2022

@ian-r-rose
Collaborator

We were not as strict as we should have been about overlapping divisions when appending to an existing parquet dataset. This PR also raises when the divisions overlap only slightly, that is, when the first index value in the new partition equals the last index value of the previous partition.
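
To make the boundary case concrete, here is a minimal sketch of the kind of check described above. It is not dask's actual implementation: the function name `check_append_divisions` and the error message are hypothetical, and divisions are modeled as plain sorted lists of index values.

```python
def check_append_divisions(existing_divisions, new_divisions):
    """Raise ValueError if appended divisions overlap the existing ones.

    Hypothetical helper for illustration only; divisions are assumed to be
    sorted sequences of comparable index values.
    """
    if not existing_divisions or not new_divisions:
        # Nothing to compare against, so there is nothing to reject.
        return

    old_end = existing_divisions[-1]
    new_start = new_divisions[0]

    # Using <= rather than < means that an appended partition starting at
    # exactly the last existing index value also counts as overlapping.
    if new_start <= old_end:
        raise ValueError(
            f"Appended divisions start at {new_start!r}, which overlaps "
            f"the existing divisions ending at {old_end!r}."
        )


# A clean append: the new data starts strictly after the old data ends.
check_append_divisions([0, 10], [11, 20])

# A "slight" overlap: the new data starts at the old end, so this raises.
try:
    check_append_divisions([0, 10], [10, 20])
except ValueError as exc:
    print(f"rejected: {exc}")
```

With a strict `<` comparison the second call would pass silently, which is the case this PR is meant to catch.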

@jcrist merged commit 52afc8c into dask:main on May 5, 2022
erayaslan pushed a commit to erayaslan/dask that referenced this pull request May 12, 2022
* Change test to also look for *slightly* overlapping divisions.

* Also raise if the start of an appended partition is equal to the end of
the last partition, as these should be considered overlapping
partitions.

* Remove commented-out code.

* Remove kwarg which is new in pandas 1.4, the default behavior is fine.

Development

Successfully merging this pull request may close these issues.

Appending to parquet file seems to break future appending
