Hi everyone,
Mid-term MNE-Python user here. I have never run into problems saving FIF files before: if I have a dataset with around 6 GB of data, whenever I save my epochs, MNE-Python usually splits them into several < 2 GB files and appends "-1", "-2" to the filenames. Then all I have to do is load the main file, and it automatically detects that a split has happened, iteratively loads all the < 2 GB pieces of the FIF data, and puts everything back together into the variable assigned from mne.read_epochs.
Now that I am dealing with more and more data, I am noticing a few edge cases. Sometimes I get an error telling me I can't save a file due to this < 2 GB restriction, even though the file is much smaller than another that saved successfully.
My analysis pipeline requires both larger and smaller subsets taken from the same data. The large data saves fine (MNE-Python splits the files evenly), but I am now noticing that in many of the cases where I extract subsets, the splitting fails: MNE-Python tries to create > 2 GB files in my current directory and then raises the error.
I am not sure how it works, but it looks like there is some calculation that determines what to do when the data is large. The subsets could easily have been split into multiple < 2 GB files, but this doesn't happen if the data is, say, slightly over the limit.
I can describe the saving process as having three main cases:
- Data is < 2 GB: epochs.save works fine
- Data is well over 2 GB: epochs.save splits the files and saves successfully
- Data is slightly over 2 GB: epochs.save doesn't split and raises an exception
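To make the hypothesis concrete, here is a pure-Python sketch — not MNE-Python's actual code, just my guess at a plausible off-by-one — showing how a floor division when counting parts would reproduce all three cases: well under and well over the limit both work, while "slightly over" produces one oversized part and an error.

```python
import math

MAX_SIZE = 2 * 1024 ** 3  # hypothetical 2 GB split limit


def plan_splits_suspected(total_bytes: int) -> int:
    """My guess at the buggy plan: floor division under-counts the
    parts when the data is only slightly over the limit."""
    return max(1, total_bytes // MAX_SIZE)


def plan_splits_expected(total_bytes: int) -> int:
    """What I would expect instead: round up so every part fits."""
    return max(1, math.ceil(total_bytes / MAX_SIZE))


def try_save(total_bytes: int, n_parts: int) -> str:
    """Simulate saving: fail if any part would exceed the limit."""
    part_size = math.ceil(total_bytes / n_parts)
    if part_size > MAX_SIZE:
        return "error: part exceeds 2 GB"
    return f"saved in {n_parts} file(s)"


# The three cases from the list above (sizes in GB)
for size_gb in (1.5, 6.0, 2.1):
    total = int(size_gb * 1024 ** 3)
    print(size_gb, "->", try_save(total, plan_splits_suspected(total)))
```

With the floor-based plan, 2.1 GB yields a single > 2 GB part and fails, while the ceiling-based plan would give two parts that both fit. Again, this is only a sketch of what I suspect, not the real implementation.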
Does anyone have any insight into the exact condition checking, and can anyone see whether there is a bug somewhere? It's awkward to keep splitting the data myself for saving and concatenating the pieces back together when I need them. Those steps take a fair bit of time, and ideally I would not have to split the data arbitrarily when it is only slightly over the limit.
If it's not a bug, then that's fine. I am just interested in knowing why it works that way.
So, if anyone can shed some light, it'd be very much appreciated.
I was trying to think of a code snippet to demonstrate the problem, but that's difficult given that it's a large-data issue and not something easily shown with a clean, reproducible example (sorry!).
Best
Alex