Use parquet read speed-ups from fastparquet.api.paths_to_cats. #5821
Merged
TomAugspurger merged 3 commits into dask:master on Feb 5, 2020
rjzamora reviewed on Jan 22, 2020
mrocklin reviewed on Jan 26, 2020
Member
OK, I've released fastparquet 0.3.3 to PyPI. CF will start building in a bit. For now, let's merge this. I opened #5865 as a followup for removing this compat code.
Member
Thanks @ig248!
This PR propagates parquet partition read speed-ups merged in dask/fastparquet#471.

Before this change, code was duplicated from fastparquet. After this change, an optimized function is imported from fastparquet. Since the fastparquet version range is not explicit in dask, the import is for now made optional, reverting to the existing implementation if using an older fastparquet.

@rjzamora I am not entirely happy with "try-catch imports" and would rather have fastparquet as e.g. an optional dependency with well-specced version ranges, but for now this could be a work-around.

black dask / flake8 dask

Potentially related issues:
#5272
#4701