Skip to content

Incremental binary file writer#4987

Closed
kgabor wants to merge 3 commits intonumpy:mainfrom
kgabor:incremental_binary_file_writer
Closed

Incremental binary file writer#4987
kgabor wants to merge 3 commits intonumpy:mainfrom
kgabor:incremental_binary_file_writer

Conversation

@kgabor
Copy link
Copy Markdown
Contributor

@kgabor kgabor commented Aug 24, 2014

I would like to add a class for writing one (possibly big) .npy file saving multiple (same dtype, compatible shape) arrays. My use case was saving slowly accumulating data regularly for a long time.

This is a first implementation, opening an existing file for append and reading back parts from a very big .npy file would be straightforward next steps. Please comment this idea.

@charris
Copy link
Copy Markdown
Member

charris commented Aug 30, 2014

This looks interesting. Could you make a post on the numpy-discussion mailing list proposing this enhancement?

@rgommers
Copy link
Copy Markdown
Member

rgommers commented Mar 8, 2015

@kgabor was this discussed on the mailing list?

@bsipocz
Copy link
Copy Markdown
Member

bsipocz commented Aug 7, 2015

There was only a single reply mentioning that hdf5 does this already.
http://thread.gmane.org/gmane.comp.python.numeric.general/58695

@bsipocz
Copy link
Copy Markdown
Member

bsipocz commented Mar 2, 2016

@charris, @rgommers - There wasn't much reaction on this on the mailing list. Do you think it can make it into a release at some point?

@charris
Copy link
Copy Markdown
Member

charris commented Mar 2, 2016

Needs rebase.

kgabor added 3 commits March 2, 2016 12:18
To implement the incremential writing of binary .npy files, support for pre-defined header space is added here.
@kgabor kgabor force-pushed the incremental_binary_file_writer branch from dcb9f61 to b734481 Compare March 2, 2016 12:27
@bsipocz
Copy link
Copy Markdown
Member

bsipocz commented Mar 2, 2016

Rebased.

return d

def _write_array_header(fp, d, version=None):
def _write_array_header(fp, d, version=None,fixedheaderlen=0,extrapad=0):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pep8-compliant spacing please :-)

@njsmith
Copy link
Copy Markdown
Member

njsmith commented Mar 3, 2016

@kgabor @bsipocz: little confused about who I'm talking to here, but :-):

idea seems sensible enough to me, and no-one objected, so I guess it's okay. Needs some cleanup though -- see above -- and tests.

@bsipocz
Copy link
Copy Markdown
Member

bsipocz commented Mar 3, 2016

@kgabor @bsipocz: little confused about who I'm talking to here, but :-):

@njsmith - Gabor wrote this routine for our pipeline. Recently I spent some time to clean up parts of it and port them to py3, and thus run into the question whether this ever made it to upstream.
Sitting in the same office makes the workflow smoother ;)

@bsipocz
Copy link
Copy Markdown
Member

bsipocz commented Aug 16, 2019

I'm happy to resurrect this PR in this release cycle.

@rgommers
Copy link
Copy Markdown
Member

@bsipocz yes that would be nice

@mattip
Copy link
Copy Markdown
Member

mattip commented Dec 3, 2019

@bsipocz ping. This missed the 1.18 cutoff

@bsipocz
Copy link
Copy Markdown
Member

bsipocz commented Dec 3, 2019

Yes, sadly there is always more on that plate than time. Maybe during the holidays I'll have more time for passion projects (and frankly I need to warm up some old projects anyway and this was part of a pipeline in one such project).

@anirudh2290
Copy link
Copy Markdown
Member

@bsipocz @kgabor do you still wish to pursue this PR ?

@bsipocz
Copy link
Copy Markdown
Member

bsipocz commented May 20, 2020

We don't actively use the pipeline any more that relied on this (and thus a patched numpy), but doing some contributions to numpy is very much on my wishlist. So I indeed plan to come back unless this is considered feature creep.

@anirudh2290
Copy link
Copy Markdown
Member

@bsipocz thanks for the reply! I havent taken a closer look, but from other comments, looks like it would be nice to have it in.

@bsipocz
Copy link
Copy Markdown
Member

bsipocz commented May 20, 2020

OK, let's put a deadline on this then, e.g. if I don't come back and wrap this up by the Scipy sprint this summer, then it's probably time to face the bitter truth and give up on it.

Base automatically changed from master to main March 4, 2021 02:03
@seberg
Copy link
Copy Markdown
Member

seberg commented Sep 8, 2021

Considering the age of this PR and the fact that it needs to be rebased, I am going to close it. We should probably discuss the API again, but for anyone interested in this work: Please feel free to rebase and open a new PR based on it.

@seberg seberg closed this Sep 8, 2021
@bsipocz
Copy link
Copy Markdown
Member

bsipocz commented Sep 8, 2021

@seberg - 💯 . This was on my mind for a very long time, yet didn't find the time, or the deep breath to finish it off. And I'm the greatest advocate of closing off stale PRs. In my experience, stale closes have the effect of helping let go or give a new boost to dust of old things to finish.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants