DOC: DataFrame→Array conversion and unknown chunks #4516
jrbourbeau merged 6 commits into dask:master
|
I had some debate over whether to include this in the DataFrame or Array documentation. I chose the Array documentation because the chunks are an Array object. |
|
Thanks for adding this @stsievert! It might be worth including a note that |
|
Thanks @jrbourbeau! I've added the note. I mention that this enables downstream computations, but don't point to any examples in case they're fixed (e.g., slicing an array raises a ...).

I can see another use case with arrays:

>>> import numpy as np
>>> import dask.array as da
>>> x = np.random.choice([-1, 0, 1], size=100)
>>> y = da.from_array(x, chunks=50)
>>> y[y != -1]
# dask.array<getitem, shape=(nan,), dtype=int64, chunksize=(nan,)>

I think computing the chunk size could be useful (e.g., with the slicing example above). Looks like #3293 (comment) is the relevant work. |
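A minimal sketch of the boolean-mask case above, for readers on a more recent Dask release. It assumes `Array.compute_chunk_sizes()` is available, which was not the case when this thread was written; the variable names `masked`, `unknown_shape`, and `known` are illustrative only.

```python
import numpy as np
import dask.array as da

x = np.random.choice([-1, 0, 1], size=100)
y = da.from_array(x, chunks=50)

# Boolean-mask indexing yields chunks of unknown length
masked = y[y != -1]
unknown_shape = masked.shape  # the single dimension is nan here

# Assumption: a Dask version that provides compute_chunk_sizes().
# This triggers an immediate computation to learn each chunk's length.
known = masked.compute_chunk_sizes()
print(known.chunks)
```

The cost is the same as noted elsewhere in this thread: the chunk sizes are learned by running the computation right away.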
|
After looking at this again, converting a Dask DataFrame to a Dask array (and the issue of |
|
Thanks for that @jrbourbeau! I think that's a better place, and I still link to it from chunks page. I also improved that page a bit – that page didn't have a mention of |
jrbourbeau left a comment
A few nitpicky comments. Otherwise LGTM
Co-Authored-By: stsievert <stsievert@users.noreply.github.com>
|
Thanks @stsievert! |
* DOC: DataFrame chunks when converting to array
* DOC: add note about immediate computation
* MAINT: move doc note to array creation
* Update docs/source/array-chunks.rst
  Co-Authored-By: stsievert <stsievert@users.noreply.github.com>
* Update docs/source/array-creation.rst
  Co-Authored-By: stsievert <stsievert@users.noreply.github.com>
* Update docs/source/array-creation.rst
  Co-Authored-By: stsievert <stsievert@users.noreply.github.com>
What does this PR implement?
This PR provides documentation for converting a Dask DataFrame to a Dask Array and computing chunks in the process (so chunks is not nan).

Reference issues/PRs
to_dask_array which I didn't realize existed." (from "Converting Dask Array of unknown shape to known shape (at the cost of computing)", #3293 (comment))