ENH: make apply_function aware of channel index by dominikwelke · Pull Request #12206 · mne-tools/mne-python

dominikwelke · 2023-11-14T12:28:18Z

this PR makes the index of currently processed channel available within custom functions applied to data.
this is needed for processing channels differently during parallel processing, as requested in #10034 .

dominikwelke · 2023-11-14T12:32:27Z

uses inspect.getfullargspec to check the custom function for input arguments, as suggested by @larsoner .
currently I only allow a specific variable name (ch_idx) to be detected - please suggest an alternative naming, if appropriate.
alternatively, one could just check for number of requested variables, but i think this would make things messier.

initial commit only for Base.raw. if you are happy with the solution i will copy to epochs and evoked

dominikwelke · 2023-11-14T12:42:19Z

I can also extend the code to support ch_name.

mne/io/tests/test_apply_function.py

Co-authored-by: Mathieu Scheltienne <mathieu.scheltienne@gmail.com>

mne/io/tests/test_apply_function.py

mne/io/base.py

dominikwelke · 2023-11-15T12:24:48Z

quick question for you @drammock :
when porting the functionality to evoked arrays I noticed that Evoked.apply_function did not currently have the option to process the whole array in place (-> channel_wise=False).

discussing with @wmvanvliet and @britta-wstnr we couldnt think of a good reason why this should be intentional, so I implemented this functionality analogous to Rawand Epochs.
I did so in a separate commit (f88ccd0), so no problem to revert if there was a good reason..

doc/changes/devel.rst

mne/epochs.py

mne/io/base.py

mne/io/tests/test_apply_function.py

mne/tests/test_epochs.py

mne/tests/test_evoked.py

hoechenberger · 2023-11-16T13:18:26Z

I like the functionality you're adding here!

What I found a little surprising, though, is that we allow functions that by themselves accept parameters passed by position, but then we insist on specific parameter names. I have never seen this anywhere in any other Python project I've worked on. If we want to enforce a parameter name, we should probably insist on a keyword-only argument

hoechenberger · 2023-11-17T14:00:28Z

I mean seriously, if we have to use inspection just to find out how to call a function in an API that we define, doesn't it sound like a hack to you immediately? To me it does...

mscheltienne · 2023-11-17T14:01:38Z

you allow for a function signature that has positional parameters, but you force users to adopt a specific name for their positional parameters. This is highly unusual

Would you be OK if we force the signature to be one of:

f(x)
f(x, *, ch_name)
f(x, *, ch_id)
f(x, *, ch_name, ch_id)
f(x, *, ch_id, ch_name)

using the same function signature but with a different name of the parameter changes function behavior. This, again, is something I have never seen done anywhere and it would cause a great deal of confusion to me if I couldn't name my positional parameters the way I want to

I'm not sure I follow this one, anyway the function signature can not be different from the one mentionned above. f(x, foo) would raise, I think?

cbrnr · 2023-11-17T14:03:21Z

Because I find clearer from a user perspective to provide a function with a signature f(x, ch_name) rather than f(x) which will have its signature silently modified to f(x, ch_name) because of this extra argument + now f(x) will have some undefined ch_name variable in its scope.

I'm not sure we're talking about the same thing. I'm not proposing to silently modify the function signature.

hoechenberger · 2023-11-17T14:04:29Z

f(x)
f(x, *, ch_name)
f(x, *, ch_id)
f(x, *, ch_name, ch_id)
f(x, *, ch_id, ch_name)

Yes, this would work if we want to go with kw-args

Personally I'd prefer to pass by position but the above would be acceptable for me too

hoechenberger · 2023-11-17T14:09:29Z

... would still require inspection though ... but I think it keeps it to minimum at least

I'd prefer to have a solution that can do without any introspection ideally

I can see why you don't like @cbrnr's proposal so much though

OTOH this is how we've been doing things in MNE-Python in many places for many years

So I don't understand why this is immediately off the table now?

I'd love to hear @agramfort weigh in if he has time

hoechenberger · 2023-11-17T14:11:20Z

Also I'd like to say: thanks for everyone contributing here, be it code or by sharing their opinion in this discussion! 🤗

drammock · 2023-11-17T18:39:30Z

After a discussion with @britta-wstnr, @larsoner and @dominikwelke we are all in favor of the current implementation (as is I think @wmvanvliet and @mscheltienne also are?). Here is the rationale as I see it:

currently, we impose quite minimal constraints on the user-passed function: it must take 1 argument (can be called anything: x, foo, whatever) that is a NumPy array, and return a NumPy array of the same dimensions.
if we were only adding 1 more supported parameter, we could say "interpreted positionally, call it whatever you want". But because we're adding 2 more supported parameters, we must either:
- force an order on the new params which means that if ch_idx comes before ch_name, then ch_idx must always be in the signature even if the user only wants/needs to use ch_name. That is slightly burdensome on the user.
- allow free order on the new params which means that we must either:
  - force the user to make them kw-only, which means explaining what the * in the function signature means, and forcing them to use it. This is also slightly burdensome on the user.
  - allow the user to make the params technically positional and in either order, and we inspect the signature to see what params are there, and pass in (by name) one or both of the two extra params that we're offering to supply. This boils down to (as @hoechenberger said):
    
    allowing for positional parameters but determine their functionality based on their name.

I agree that this is unusual. But to me the question from the UX perspective is: do we compel the user to write their one-off channel-modifying function to follow best-practices around APIs? Or do we say "its first arg must be the data, and it can have 1 or 2 others (ch_name and/or ch_idx) if you want/need them." To me this is a perfect use case for inspect --- it allows us to make the user's job slightly easier, reduces the amount of explanation we have to give around how to craft these functions, and as long as they follow the instructions about param naming they don't need to think/worry about param order.

cbrnr · 2023-11-17T18:43:41Z

I am still not in favor, but whatever.

hoechenberger · 2023-11-17T18:44:12Z

I see your arguments but I disagree that this makes it easier for users. Like you said, it's unusual, hence likely to cause confusion or at least additional cognitive load. It sets MNE's behavior apart from the rest of the Python ecosystem by introducing some kind of meta language feature, and thereby adds inconsistencies without any need to do so.

My 2 ct

hoechenberger · 2023-11-17T18:45:45Z

This is not a veto. But I want it to be noted that I think this implementation is a bad idea.

drammock · 2023-11-17T18:55:24Z

Like you said, it's unusual, hence likely to cause confusion or at least additional cognitive load.

For transparency, my perspective rests on the assumptions that

it seems unusual only because we have intermediate/advanced knowledge of how Python works
many/most of our users are not Python experts, and will not even notice that there is anything unusual about this at all, because they are not calling their own function (we are calling it behind the scenes).

Personally I had been coding in Python a few years before even seeing an * in a function signature, and then I had to look up what it meant. Should we require users have that bit of background knowledge, if we can safely do what needs doing without requiring it?

hoechenberger · 2023-11-17T19:13:58Z

Like you said, it's unusual, hence likely to cause confusion or at least additional cognitive load.

For transparency, my perspective rests on the assumptions that

it seems unusual only because we have intermediate/advanced knowledge of how Python works

Exactly. That's why it should be avoided, because as soon as users start gaining experience and using other libraries like scikit-learn, they'll experience confusion due to this inconsistency.

Also, we're being inconsistent within MNE itself. We don't do this stuff anywhere else even within MNE AFAIK, which IMHO is reason enough to seriously reconsider this decision.

many/most of our users are not Python experts, and will not even notice that there is anything unusual about this at all, because they are not calling their own function (we are calling it behind the scenes).

We're asking users to define their own function that does NumPy data manipulation but don't trust they'll be able to write a function signature or know any Python basics? This doesn't make sense, in my opinion – it's a contradiction in itself.

Personally I had been coding in Python a few years before even seeing an * in a function signature, and then I had to look up what it meant.

kw-only arguments are relatively new and advanced stuff. Hence I'm not a big fan of having them here – I only brought them up because they'd be the clean solution to the idea of passing parameters by name. If I could choose, I'd define a fixed signature for the callback, and pass things by position. This is what happens with Matplotlib callbacks, with scikit-learn scorers etc.

Should we require users have that bit of background knowledge, if we can safely do what needs doing without requiring it?

I'm just saying by trying to make things easier for absolute beginners, you're adding confusion to anyone wiling to advance or already having an advanced level of Python proficiency. And adding inconsistent behavior inside MNE-Python itself.

We're adopting SPECs and Black coding styles and whatnot, but when it comes to functions, we start our very own special thing because MNE-Python is so special. Sorry, err… no. Just no. I mean do it if you want to, but I don't like it. At all. I do agree that Python is too difficult for beginners, though.

hoechenberger · 2023-11-17T19:25:48Z

Thanks for the meeting summaries and explanations, @drammock

I know it's difficult to keep everyone involved & informed here while we're working and discussing so asynchronously. I appreciate your effort.

larsoner · 2023-11-17T19:36:02Z

@dominikwelke I think you should be okay to proceed with the inspect-based implementation

drammock · 2023-11-17T21:10:04Z

as soon as users start gaining experience and using other libraries like scikit-learn, they'll experience confusion due to this inconsistency.

I think this is an assumption without evidence. They might not notice or remember the difference, they might be confused, or they might think (with curiosity or even admiration) "huh, I wonder how MNE-Python pulled that off?"

We don't do this stuff anywhere else even within MNE

Depends on what you mean by "this stuff". We do use inspect elsewhere. I don't think we take this specific approach to callables elsewhere, but we've also AFAIK never needed to (i.e., never had a motivation to be flexible about the call signature of a passed-in callable). Now we have that motivation. FWIW I don't think we should change our whole codebase to do this for all callables, it wouldn't make sense.

We're asking users to define their own function that does NumPy data manipulation but don't trust they'll be able to write a function signature or know any Python basics?

to me this isn't about trusting the user, it's about making life easier for them. We do this all over the place by allowing list/tuple/array, for example, or by allowing them to pass "mean" instead of passing the actual function np.mean. Here, we're saving them the trouble of writing the extra *, in their callable, or even knowing about how to specify a keyword-only param.

kw-only arguments are relatively new and advanced stuff.

Wait, are they a "Python basic" or are they "advanced stuff"?

but when it comes to functions, we start our very own special thing because MNE-Python is so special.

Let me give an analogy. I want to use a function that someone else wrote. I see it has a signature def divide(foo, bar):. Of course I read the docstring, because I don't know if I'm supposed to pass dividend, divisor or the other way around. If instead the signature was def divide(*, foo, bar): that doesn't help me at all because I still need to know what to pass in as foo and what to pass in as bar. Now instead imagine the signature was def divide(dividend, divisor):. Great! Even if there's no docstring, I feel fairly safe writing assert divide(6, 2) == 3 (positional) or assert divide(divisor=2, dividend=6) == 3 (keyword).

Now turn the perspective around. The "user" in this case is not me, it is MNE-Python's code. It needs to call a function that someone else wrote. It cannot read that function's docstring, and it cannot (easily) read that function's source code to figure out how it works. But what can it do? It can look at the param names, and say "aha! I have a dividend and a divisor here, and I see param names dividend and divisor in the function signature. I feel confident that I can pass dividend=dividend, divisor=divisor in either order and expect to get the right result regardless of the order they appeared in the function signature and regardless of whether the function would have accepted them as positional!"

To me that is a case of our code handling realistic, real-world user input (callables) flexibly, smartly and gracefully. It is not us "starting our very own special thing".

With that, I'm going to sign off of this discussion. My goal here is not necessarily to change your vote @hoechenberger, but hopefully lead you to feel less vehemently opposed to the will of the majority. You're welcome to have the last word for posterity, if you want it.

hoechenberger · 2023-11-17T21:17:19Z

You're entirely ignoring the possibility of passing parameters by position, or @cbrnr's proposal of adding a switch to control the behavior, like we do elsewhere. I don't know why you're providing an example of good vs bad parameter naming, it just seems out of place.

wmvanvliet · 2023-11-20T10:26:40Z

Given that this is a pretty niche bit of functionality, could we perhaps YAGNI a bit and only give the channel index, not its name? So either f(x) or f(x, ch_index). Given that the purpose of the function is to modify the data (as opposed to say plotting it) I think ch_index will be used a lot more than ch_name. Perhaps it's not too much to ask for the user to do a inst.ch_names[ch_index] to look up the channel name if they need to. Or is there some scenario in which it would be harder to look up the channel name? (dropped bad channels come to mind).

dominikwelke · 2023-11-21T18:42:42Z

just applied @mscheltienne 's suggestions. let me know what to do next

could we perhaps YAGNI a bit and only give the channel index, not its name? (...) Or is there some scenario in which it would be harder to look up the channel name? (dropped bad channels come to mind).

this would work, but yes - for the user it would be more dificult if they wanted to use ch_names. they would have to pull/create a list or dict with the channel names before or while setting up their def that must correspond to insts ch_names when eventually processed. this seems doable but quite error prone and i think is incompatible with rejecting channels between def declaration and calling apply_function.
in the end we would require users to keep track of something that is readily available within apply_function's scope (->inst.info)

if this was to be the desired route, i would rather consider only offering ch_name - this way ppl could directly access relevant information (e.g. scalp location or hemispere in case of eeg - if 'O' in ch_name: ...) and define more transparent/human readable dictionaries if they needed an int.

wmvanvliet · 2023-11-22T14:25:43Z

I understand the decision has already been made, but for the record, I would also be fine with just having either 1 or 3 positional parameters. So either f(x) or f(x, ch_idx, ch_name). The inconvenience to the user to have a possibly useless parameter passed to their function is so small it doesn't warrant forcing through an apparently controversial design decision.

drammock

this looks pretty good! a couple of suggested simplifications/cleanups below. Reviewing this makes me realize how much code duplication there is in these three methods; maybe after this one is merged you could try (if you want to) to refactor some of that duplication into a helper function that all three methods could use?

drammock · 2023-11-30T22:16:50Z

mne/epochs.py

        if dtype is not None and dtype != self._data.dtype:
            self._data = self._data.astype(dtype)

+        args = getfullargspec(fun)[0] + getfullargspec(fun)[4]


let's take advantage of the fact that getfullargspec returns a named tuple, to make this line a bit more transparent about what we're doing:

Suggested change

args = getfullargspec(fun)[0] + getfullargspec(fun)[4]

args = getfullargspec(fun).args + getfullargspec(fun).kwonlyargs

mne/epochs.py

mne/evoked.py

drammock · 2023-11-30T23:02:02Z

mne/evoked.py

+        channel_wise=True,
+        verbose=None,
+        **kwargs,


@dominikwelke do this too please

mne/io/base.py

mne/io/tests/test_apply_function.py

dominikwelke · 2024-02-03T20:16:16Z

hi @drammock and all
apologies it took me forever to look into this again..
i applied the requested changes

larsoner

Thanks for sticking with it @dominikwelke, looks good and implements what we (eventually!) converged on, so in it goes!

Co-authored-by: Mathieu Scheltienne <mathieu.scheltienne@gmail.com>

make apply_function aware of channel index

59aee04

dominikwelke requested review from agramfort, cbrnr, dengemann, drammock and mscheltienne as code owners November 14, 2023 12:28

dominikwelke added 2 commits November 14, 2023 13:32

add unit test

8bbc0c0

add logger entry

06c9dbe

mscheltienne reviewed Nov 14, 2023

View reviewed changes

mne/io/tests/test_apply_function.py Outdated Show resolved Hide resolved

Update mne/io/tests/test_apply_function.py

b8549ba

Co-authored-by: Mathieu Scheltienne <mathieu.scheltienne@gmail.com>

mscheltienne reviewed Nov 14, 2023

View reviewed changes

mne/io/tests/test_apply_function.py Outdated Show resolved Hide resolved

dominikwelke added 3 commits November 14, 2023 15:35

Update mne/io/tests/test_apply_function.py

80daed1

make apply_function aware of channel name

85e6f11

minor improvements

6f594a5

hoechenberger reviewed Nov 14, 2023

View reviewed changes

mne/io/base.py Outdated Show resolved Hide resolved

larsoner reviewed Nov 14, 2023

View reviewed changes

mne/io/base.py Outdated Show resolved Hide resolved

dominikwelke added 4 commits November 14, 2023 19:10

update docstring

5737b8d

implementation for epochs

5133c51

implementation for evoked

106b7b6

add channel_wise=False branch for evoked

f88ccd0

dominikwelke added 2 commits November 15, 2023 12:58

merge upstream

0b087b9

main changelog entry

a6cb38b

wmvanvliet reviewed Nov 15, 2023

View reviewed changes

dominikwelke added 2 commits November 16, 2023 12:09

merge upstream

85bfaae

enable access to ch_idx AND ch_name

6d35e19

dominikwelke force-pushed the issue-10034 branch from 32e379e to 6d35e19 Compare November 16, 2023 12:56

dominikwelke added 2 commits November 21, 2023 18:10

Merge remote-tracking branch 'upstream/main' into issue-10034

7c7ccf3

apply mscheltienne's reviews

541a36a

drammock reviewed Nov 30, 2023

View reviewed changes

drammock mentioned this pull request Dec 11, 2023

Extracting the time series of activations in a label #11922

Open

dominikwelke added 2 commits February 3, 2024 18:39

Merge remote-tracking branch 'upstream/main' into issue-10034

b2c79f4

apply Dan's revisions

da6c83a

larsoner approved these changes Feb 6, 2024

View reviewed changes

larsoner merged commit 87df00d into mne-tools:main Feb 6, 2024

snwnde pushed a commit to snwnde/mne-python that referenced this pull request Mar 20, 2024

ENH: make apply_function aware of channel index (mne-tools#12206)

97e8dcd

Co-authored-by: Mathieu Scheltienne <mathieu.scheltienne@gmail.com>

dominikwelke deleted the issue-10034 branch April 14, 2025 11:04

	args = getfullargspec(fun)[0] + getfullargspec(fun)[4]
	args = getfullargspec(fun).args + getfullargspec(fun).kwonlyargs

Uh oh!

Conversation

dominikwelke commented Nov 14, 2023

Uh oh!

dominikwelke commented Nov 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dominikwelke commented Nov 14, 2023

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

dominikwelke commented Nov 15, 2023

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

hoechenberger commented Nov 16, 2023

Uh oh!

hoechenberger commented Nov 17, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mscheltienne commented Nov 17, 2023

Uh oh!

cbrnr commented Nov 17, 2023

Uh oh!

hoechenberger commented Nov 17, 2023

Uh oh!

hoechenberger commented Nov 17, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hoechenberger commented Nov 17, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

drammock commented Nov 17, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cbrnr commented Nov 17, 2023

Uh oh!

hoechenberger commented Nov 17, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hoechenberger commented Nov 17, 2023

Uh oh!

drammock commented Nov 17, 2023

Uh oh!

hoechenberger commented Nov 17, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hoechenberger commented Nov 17, 2023

Uh oh!

larsoner commented Nov 17, 2023

Uh oh!

drammock commented Nov 17, 2023

Uh oh!

hoechenberger commented Nov 17, 2023

Uh oh!

wmvanvliet commented Nov 20, 2023

Uh oh!

dominikwelke commented Nov 21, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

wmvanvliet commented Nov 22, 2023

Uh oh!

drammock left a comment

Choose a reason for hiding this comment

Uh oh!

drammock Nov 30, 2023

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

dominikwelke commented Nov 14, 2023 •

edited

Loading

hoechenberger commented Nov 17, 2023 •

edited

Loading

hoechenberger commented Nov 17, 2023 •

edited

Loading

hoechenberger commented Nov 17, 2023 •

edited

Loading

drammock commented Nov 17, 2023 •

edited

Loading

hoechenberger commented Nov 17, 2023 •

edited

Loading

hoechenberger commented Nov 17, 2023 •

edited

Loading

dominikwelke commented Nov 21, 2023 •

edited

Loading