Conversation
|
What would be the right way of determining the size of data held in a BytesIO on py2? Is it something that needs to be saved via tell() when we are done writing instead? |
|
I don't know personally, but this seems like the kind of thing that might
get an answer on StackOverflow relatively quickly.
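
For reference, one portable option is to seek to the end and read the position back. This is only a minimal sketch, not what the PR does: the helper name `bytesio_size` is made up for illustration, and it assumes the buffer is an `io.BytesIO`, which exists on both py2.7 and py3.

```python
import io


def bytesio_size(buf):
    """Return the number of bytes held in buf, leaving its position unchanged.

    Hypothetical helper for illustration; not part of dask.
    """
    pos = buf.tell()      # remember where the caller was
    buf.seek(0, 2)        # whence=2 means "relative to the end of the stream"
    size = buf.tell()     # position at the end == total size in bytes
    buf.seek(pos)         # restore the original position
    return size


buf = io.BytesIO()
buf.write(b"hello world")
buf.seek(2)               # move somewhere in the middle
size = bytesio_size(buf)
```

`len(buf.getvalue())` gives the same number but copies the whole buffer, so the seek/tell route may be preferable for large data. Saving `tell()` right after writing also works, as long as nothing seeks in between.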
(In reply to Martin Durant, Thu, Oct 26, 2017 at 10:23 AM.)
|
|
Actually, thinking about it a moment, |
|
@martindurant is there anything that remains to be done here? What's here seems fine to me. My only comment is that there seems to be a fair amount of copy-pasting between filesystem test suites. It might make sense at some point to construct an inheritable test class that others can use for tests. This might be something that we hand to the Arrow folks for use with their HDFS implementation. |
|
I think this is complete enough to be useful. |
|
I think that this needs to be added to the import at It would also be nice to see a roundtrip test with |
|
@martindurant ok to merge? |
|
Yes, I think so. This does not appear explicitly in the docs, but it is a fairly niche use. |
Plus auto-import the back-end
Use UUID for ukey; file may have changed at any time
Use temp directory for test server
|
Updated here with the simplifications that went into bytes. Can be merged after #3160, if that is good to go. |
|
@alimanfoo , if this would be useful to you for making in-memory zarr files, then please try it out and see how well it works. |
|
Cool, thank you, I'll take a look.
(In reply to Martin Durant, Mon, 28 May 2018, 01:13.)
|
|
Have a few questions. What contexts does this work in (e.g. single threaded, multithreaded, multiprocessing, distributed, etc.)? Also how does this work when someone wants to access this stored data? |
|
There are a couple of examples of round-tripping in the tests, so the following should work so long as we are within one process (sync or thread scheduler, or distributed in-process). If you are not in one process, you would still successfully make the file-like objects of binary data, but would not know which piece was where. That is like persisting a set of keys (binary data in memory) without the global map of which key is where - i.e., not too useful. |
Enough to get to_zarr/from_zarr working
|
With those changes, a simple zarr roundtrip does work. |
From dask/dask#2741 (which can be closed)
|
Closing based on https://github.com/martindurant/filesystem_spec/pull/11#issue-209228566. @martindurant feel free to re-open if needed |
flake8 dask
docs/source/changelog.rst for all changes and one of the docs/source/*-api.rst files for new API
See dask/fastparquet#215
This is a single global store. It is meant for use only with the Threaded scheduler - not sure how useful it is.