Incremental binary file writer by kgabor · Pull Request #4987 · numpy/numpy

kgabor · 2014-08-24T00:42:10Z

I would like to add a class that writes multiple arrays (same dtype, compatible shapes) into a single, possibly big, .npy file. My use case was regularly saving slowly accumulating data over a long time.

This is a first implementation; opening an existing file for appending and reading back parts of a very big .npy file would be straightforward next steps. Please comment on this idea.
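
A minimal sketch of the idea, not the code from this PR: reserve a fixed amount of header space up front, append raw array data, and rewrite the header with the final shape on close. The names `IncrementalNpyWriter` and `HEADER_SPACE` are hypothetical, the header is written by hand in .npy format version 1.0, and only simple (non-structured) dtypes in C order are assumed.

```python
import struct

import numpy as np

MAGIC = b"\x93NUMPY\x01\x00"  # .npy magic string followed by format version 1.0
HEADER_SPACE = 128            # hypothetical: total bytes reserved before the data


class IncrementalNpyWriter:
    """Hypothetical sketch: append same-dtype blocks to one .npy file."""

    def __init__(self, path, dtype, row_shape):
        self.dtype = np.dtype(dtype)
        self.row_shape = tuple(row_shape)  # shape of one entry along axis 0
        self.rows = 0
        self.fp = open(path, "wb")
        self._write_header()  # placeholder header with zero rows

    def _write_header(self):
        shape = (self.rows,) + self.row_shape
        text = "{'descr': %r, 'fortran_order': False, 'shape': %r, }" % (
            self.dtype.str, shape)
        # Bytes left for the header dict after the magic and the length field.
        hlen = HEADER_SPACE - len(MAGIC) - 2
        if len(text) + 1 > hlen:
            raise ValueError("header does not fit into the reserved space")
        # Pad with spaces and end with a newline, as the .npy format expects.
        header = text.ljust(hlen - 1) + "\n"
        self.fp.seek(0)
        self.fp.write(MAGIC + struct.pack("<H", hlen) + header.encode("latin1"))

    def append(self, arr):
        arr = np.ascontiguousarray(arr, dtype=self.dtype)
        if arr.shape[1:] != self.row_shape:
            raise ValueError("incompatible shape: %r" % (arr.shape,))
        self.fp.seek(0, 2)  # jump to the end of the file
        self.fp.write(arr.tobytes())
        self.rows += arr.shape[0]

    def close(self):
        self._write_header()  # overwrite the placeholder with the final shape
        self.fp.close()
```

After `close()` the result should be an ordinary .npy file, so `np.load(path)` reads it back, and `np.load(path, mmap_mode="r")` allows reading only parts of a very big file. The fixed header size is the design trade-off: it lets the shape be updated in place, but it caps how long the shape's text representation may grow.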

charris · 2014-08-30T17:20:57Z

This looks interesting. Could you make a post on the numpy-discussion mailing list proposing this enhancement?

rgommers · 2015-03-08T22:00:10Z

@kgabor was this discussed on the mailing list?

bsipocz · 2015-08-07T11:57:59Z

There was only a single reply, mentioning that HDF5 does this already.
http://thread.gmane.org/gmane.comp.python.numeric.general/58695

bsipocz · 2016-03-02T02:15:56Z

@charris, @rgommers - There wasn't much reaction on this on the mailing list. Do you think it can make it into a release at some point?

charris · 2016-03-02T02:39:26Z

Needs rebase.

To implement the incremental writing of binary .npy files, support for pre-defined header space is added here.

bsipocz · 2016-03-02T14:49:50Z

Rebased.

njsmith · 2016-03-03T00:13:43Z

numpy/lib/format.py

    return d

-def _write_array_header(fp, d, version=None):
+def _write_array_header(fp, d, version=None,fixedheaderlen=0,extrapad=0):


pep8-compliant spacing please :-)

njsmith · 2016-03-03T00:33:51Z

@kgabor @bsipocz: little confused about who I'm talking to here, but :-):

idea seems sensible enough to me, and no-one objected, so I guess it's okay. Needs some cleanup though -- see above -- and tests.

bsipocz · 2016-03-03T13:33:08Z

> @kgabor @bsipocz: little confused about who I'm talking to here, but :-):

@njsmith - Gabor wrote this routine for our pipeline. Recently I spent some time cleaning up parts of it and porting them to py3, and thus ran into the question of whether this ever made it upstream.
Sitting in the same office makes the workflow smoother ;)

bsipocz · 2019-08-16T04:52:18Z

I'm happy to resurrect this PR in this release cycle.

rgommers · 2019-08-16T11:36:17Z

@bsipocz yes that would be nice

mattip · 2019-12-03T07:34:30Z

@bsipocz ping. This missed the 1.18 cutoff

bsipocz · 2019-12-03T08:15:47Z

Yes, sadly there is always more on that plate than time. Maybe during the holidays I'll have more time for passion projects (and frankly I need to warm up some old projects anyway and this was part of a pipeline in one such project).

anirudh2290 · 2020-05-20T20:59:29Z

@bsipocz @kgabor do you still wish to pursue this PR ?

bsipocz · 2020-05-20T21:04:09Z

We no longer actively use the pipeline that relied on this (and thus on a patched numpy), but contributing to numpy is very much on my wishlist. So I do plan to come back to it, unless this is considered feature creep.

anirudh2290 · 2020-05-20T21:19:00Z

@bsipocz thanks for the reply! I haven't taken a closer look, but from the other comments it looks like it would be nice to have it in.

bsipocz · 2020-05-20T21:25:14Z

OK, let's put a deadline on this then, e.g. if I don't come back and wrap this up by the SciPy sprint this summer, then it's probably time to face the bitter truth and give up on it.

seberg · 2021-09-08T17:03:15Z

Considering the age of this PR and the fact that it needs to be rebased, I am going to close it. We should probably discuss the API again, but for anyone interested in this work: Please feel free to rebase and open a new PR based on it.

bsipocz · 2021-09-08T17:07:48Z

@seberg - 💯. This was on my mind for a very long time, yet I didn't find the time, or the deep breath, to finish it off. And I'm the greatest advocate of closing stale PRs. In my experience, closing a stale PR either helps people let go or gives them a new boost to dust off old things and finish them.

rgommers added the "01 - Enhancement" and "component: numpy.lib" labels Mar 8, 2015

kgabor added 3 commits March 2, 2016 12:18

Adding optional keyword arguments to _write_array_header.

c4d3e3f

To implement the incremental writing of binary .npy files, support for pre-defined header space is added here.

Adding IncrementalWriter for incremental binary .npy file writing

3e34663

Docs update for IncrementalWriter

b734481

kgabor force-pushed the incremental_binary_file_writer branch from dcb9f61 to b734481 Compare March 2, 2016 12:27

njsmith reviewed Mar 3, 2016
seberg added the "54 - Needs decision" and "55 - Needs work" labels Apr 26, 2019

anirudh2290 added the 61 - Stale label May 20, 2020

anirudh2290 removed the 61 - Stale label May 20, 2020

Base automatically changed from master to main March 4, 2021 02:03

seberg closed this Sep 8, 2021

mattip mentioned this pull request Sep 8, 2022

ENH: allow NumPy created .npy files to be appended in-place #20321

Merged
