BUG: Fix epoch splits naming#11876
Conversation
agramfort
left a comment
There was a problem hiding this comment.
let me know when it's good to go from your end. Don't forget to update latest.inc file in the doc. thx
|
While working on the PR, found another problem: if the ending is not BIDS, i.e. Also, should |
| with pytest.raises(FileExistsError, match="Destination file"): | ||
| epochs.save(split_fname, split_naming=split_naming, verbose=True) | ||
| os.remove(split_fname) | ||
| # we don't test for reserved files as it's not implemented here |
There was a problem hiding this comment.
I don't know what reserved files means here. I noticed, Raw.save() mentions something reserving the main file when producing splits. Is it something important I should be aware of? For now I just removed this comment.
mne/tests/test_epochs.py
Outdated
| events = mne.make_fixed_length_events(raw, 1) | ||
| epochs = mne.Epochs(raw, events) | ||
| if split_size == "2MB" and (metadata or concat): | ||
| n_files += 1 |
There was a problem hiding this comment.
Not sure what this was doing. Removed it.
There was a problem hiding this comment.
This was a special case inside the test. Basically it was necessary because in the case where your split_size is small and you have metadata (or concat, for whatever reason), you will end up with another file. It doesn't seem like your PR should change the file sizes produced but rather just the filenames, so it seems like a bug (either on this PR or in main currently) that you were able to remove this and have things still pass.
There was a problem hiding this comment.
... looks like it is a bug on main because we never use split_size="2MB", so removing this seems okay
mne/tests/test_epochs.py
Outdated
| @@ -1511,50 +1511,102 @@ def test_split_saving(tmp_path, split_size, n_epochs, n_files, size, metadata, c | |||
| } | |||
| ) | |||
| epochs.metadata = metadata | |||
There was a problem hiding this comment.
I don't see, how the metadata are used in the tests. Do metadata affect epochs.drop_bad()? I don't think they do. Same thing with concat. Maybe they should be removed.
There was a problem hiding this comment.
They were probably added to make sure that we actually split the metadata properly (or copy it? not sure which we do currently) when splitting the epoch data across multiple files
There was a problem hiding this comment.
I've looked into commit history and it seems they are added to fix #7897. My current understanding is that there was a bug with saving splits when the size just marginally exceeded the splitting threshold, so metadata and concat manipulations help catch this corner case. Does this make sense?
Also there's #5102 related to problem with events when loading back the splits.
There was a problem hiding this comment.
there was a bug with saving splits when the size just marginally exceeded the splitting threshold
ah yes I remember that one. Nice git archeology!
@sappelhoff WDYT the BIDS-sensible thing is to do in this case? |
larsoner
left a comment
There was a problem hiding this comment.
@dmalt thanks for working on this!
As it stands, this PR is a little bit tough to review because you're mixing adding functionality with refactoring non-trivial tests that cover weird corner cases. It is hard to see what has been changed to make tests cleaner/better while keeping all existing functionalityl/checks (this is difficult to see) vs what was changed to accommodate the new naming scheme. It makes me worry that we might silently have break/omit some test that we put in place previously to cover some corner case. So even though the tests look cleaner, in some sense they are a bit less safe now.
I would suggest reverting the test changes, then start by adding just a new parametrization like:
@pytest.mark.parametrize("split_naming", ("neuromag", "bids"))
on a test or a few tests. Then you'll need to make a few small naming updates in the tests. This would be digestible from a review standpoint.
Alternatively you could open a fresh PR that only does the test refactoring without adding anything new. This could include stuff like the removal of the if split_size == "2MB" that probably doesn't need to be there. Then we get that green and merged, then rebase this PR on that... and then this PR once again hopefully just adds a parametrize + naming updates as above.
Does that make sense?
|
Yes, you're right, the PR spiralled out of control real quick:) Let me try from scratch. I feel like refactoring tests first would be a better option. The library code changes are easy but testing them without bloating an existing test even more -- not as much. |
IMHO mne-bids is the BIDS-aware software and the feature that is provided via MNE-Python here is more a convenience for mne-bids. So I don't have a strong opinion as we correctly use this in mne-bids. I wouldn't get into too much of a logic branching here and simply assume that users will either:
|
997074b to
651517c
Compare
I was thinking adding a simple check |
As a matter of fact, this check is already implemented. It was just turned off. The filenames for splits are constructed via |
mne/epochs.py
Outdated
| fname = f"{base}-{part_idx:d}{ext}" | ||
| elif split_naming == "bids" and n_parts > 1: | ||
| fname = _construct_bids_filename(base, ext, part_idx + 1) | ||
| _check_fname(fname, overwrite=overwrite) |
There was a problem hiding this comment.
call _check_fname() for both 'neuromag' and 'bids' to fix overwriting bug when
epochs.save('test-epo.fif', overwrite=False) would overwrite existing 'test-epo-1.fif'
mne/epochs.py
Outdated
| if part_idx > 0 and split_naming == "neuromag": | ||
| fname = f"{base}-{part_idx:d}{ext}" | ||
| elif split_naming == "bids" and n_parts > 1: | ||
| fname = _construct_bids_filename(base, ext, part_idx + 1) |
There was a problem hiding this comment.
remove validate=False to prohibit using '-epo.fif' with split_naming='bids'
|
This is ready from my side. |
0097cfb to
91b4a04
Compare
Use already present validation mechanism. For now validation only works when splits will be created, i.e. test-epo.fif with split_naming="bids" is still allowed. Record this behaviour in tests.
[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
a5fb3e2 to
5a9ddeb
Compare
- add xfail reasons - add match regex to pytest.raises - remove elif, add _check_option
for more information, see https://pre-commit.ci
|
@larsoner our "check rendered docs here" CI isn't working? takes me to the same link as the CircleCI Details link, and there is no artifact to view |
|
It also doesn't run |
|
(By "sign up" I just meant to "log in with GitHub" on the UI when you click "Details", so if you did more than that it would be good to know!) |
Co-authored-by: Daniel McCloy <dan@mccloy.info>
|
OK well I guess no harm in re-running them with the changelog commit then. I'll commit that and then mark for merge when green |
|
ha, @larsoner beats me to it again :) |
* upstream/main: BUG: Fix epoch splits naming (mne-tools#11876)
Honestly, no idea. Sorry :( All I did is logging in with GitHub. It's my first experience with Circle CI, so I might have clicked something by accident. |
Sounds like what you did should have worked, weird! |
And I still have this problem, right? For my next PR I mean. |
|
Yes probably... maybe there is some way you can disable CircleCI building on your fork, you can try messing with settings in https://app.circleci.com/pipelines/github/dmalt/mne-python |
Ok, I think the problem was that I clicked "Set Up Project" next to MNE-Python right after logging in. To be fair, the button was right in the middle of a screen, so it was calling for action:) I deleted the project and created a test PR #11897 and now it seems to be working fine. |
* upstream/main: [pre-commit.ci] pre-commit autoupdate (mne-tools#11911) [BUG, MRG] Remove check on `mne.viz.Brain.add_volume_labels` (mne-tools#11889) Small splits fix (mne-tools#11905) adds niseq package to "Related software" (mne-tools#11909) Minor fixes for ERDS maps example (mne-tools#11904) FIX: Fix pyvista rendering (mne-tools#11896) BUG: Fix epoch splits naming (mne-tools#11876) ENH: Use section-title for HTML anchors in Report (mne-tools#11890) CI: Deploy [circle deploy] MAINT: Clean up whats_new and doc versions (mne-tools#11888) Refactor test_epochs.py::test_split_saving (2 out of 2) (mne-tools#11884) Cross-figure event passing system (mne-tools#11685) MAINT: Post-release deprecations, updates [circle deploy] (mne-tools#11887) MAINT: Release 1.5.0 (mne-tools#11886) [pre-commit.ci] pre-commit autoupdate (mne-tools#11883) Refactor test_epochs.py::test_split_saving (1 out of 2) (mne-tools#11880) FIX: Missing Saccade information in Eyelink File (mne-tools#11877) Improve drawing of annotations with matplotlib (mne-tools#11855) MAINT: Work around NumPy deprecation (mne-tools#11878)
Co-authored-by: Eric Larson <larson.eric.d@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Daniel McCloy <dan@mccloy.info>
Fixes #11870
Fixes overwriting bug, when
epoch.save('test-epo.fif', overwrite=False, split_naming="neuromag")would overwrite"test-epo-1.fif".Disallows filenames like
"test-epo.fif"withsplit_naming="bids"to avoid surprises with split names. File names like"a_b-epo.fif", i.e. having several bids clauses divided by "_" are still allowed. Also we don't do this check if no splits will be created.