From @britta-wstnr's use case: if a FIF file is written with one sample per data chunk:
```python
import mne
import numpy as np

data = np.random.RandomState(0).randn(32, 5000 * 8 * 60)
info = mne.create_info(32, 5000., "mag")
raw = mne.io.RawArray(data, info)
raw.save("test_default_raw.fif", overwrite=True)
raw.save("test_1samp_raw.fif", buffer_size_sec=1. / raw.info["sfreq"],
         overwrite=True)  # takes ages
```
You get a much slower read:
```shell
$ time python -c 'import mne; mne.io.read_raw_fif("test_default_raw.fif").load_data()'
real	0m1.496s
$ time python -c 'import mne; mne.io.read_raw_fif("test_1samp_raw.fif").load_data()'
real	0m27.146s
```
In principle, if we know all tags are contiguous on disk, we can use a structured `np.dtype` consisting of the tag header plus the tag data to read all necessary tags in one shot, then populate the output data from the tag-data field (discarding the tag headers), avoiding the costly per-tag Python loop.
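As a rough sketch of the idea (not MNE's actual reader): assuming each tag is a fixed-size big-endian header followed by exactly one sample of float32 data across all channels, a single structured dtype can describe the whole contiguous run of tags, so one vectorized `np.frombuffer` (or `np.fromfile`) call replaces the loop. The header field names and sizes below follow the FIF tag layout (four big-endian int32s), but the surrounding setup is synthetic:

```python
import numpy as np

n_ch, n_samp = 4, 1000

# One record per tag: 16-byte header (kind, type, size, next)
# followed by one sample of n_ch big-endian float32 values.
tag_dtype = np.dtype([
    ("kind", ">i4"), ("type", ">i4"), ("size", ">i4"), ("next", ">i4"),
    ("data", ">f4", (n_ch,)),
])

# Build a fake contiguous run of tags standing in for the file bytes.
rng = np.random.RandomState(0)
samples = rng.randn(n_samp, n_ch).astype(">f4")
tags = np.zeros(n_samp, dtype=tag_dtype)
tags["size"] = n_ch * 4
tags["data"] = samples
raw_bytes = tags.tobytes()

# Single vectorized read instead of a per-tag loop; on a real file
# this would be np.fromfile(fid, dtype=tag_dtype, count=n_samp).
parsed = np.frombuffer(raw_bytes, dtype=tag_dtype)
data = parsed["data"].T  # shape (n_ch, n_samp); headers discarded
```

The same dtype trick generalizes to buffers of more than one sample per tag by widening the `data` field, as long as every tag in the run has the same size.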
I'll try to take a look at some point!