BUG: Fix bugs with corrmap computation (#11858)
Conversation
@jona-sassenhagen any chance you have time (& inclination) to review this one?
jona-sassenhagen left a comment:
Interesting. I have no recollection of how I originally implemented this. The code looks horrible, the names are so hard to read! I think taking the median was probably intentional, the problem with normalization was probably just a bug that I never discovered because the results were looking sensible.
The original EEGLAB/MATLAB code is this, btw:
```matlab
%%%%%%%%%%%%%%%%%%%% root mean square normalization %%%%%%%%%%%%%%%%%%%%%%
rms=sqrt( mean( ( comp{ind_c(g)}(:,corr_dec_abs(g,2)) ).^2 ) );
sel_ic(:,g)=(comp{ind_c(g)}(:,corr_dec_abs(g,2)))./rms;
ttt=find(-corr_dec_abs(g,1)==all_info(:,1));
```
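Translated to NumPy, the RMS normalization step in that snippet amounts to the following (a sketch with a made-up toy map, not MNE's actual code):

```python
import numpy as np

# Hypothetical single IC map; the EEGLAB code above does this per component.
ic_map = np.array([1.0, -2.0, 3.0, -4.0])

# Root-mean-square normalization, as in the MATLAB line:
# rms = sqrt(mean(x .^ 2)); x_norm = x ./ rms
rms = np.sqrt(np.mean(ic_map ** 2))
ic_norm = ic_map / rms

# After normalization the map has unit RMS amplitude.
print(np.sqrt(np.mean(ic_norm ** 2)))  # -> 1.0
```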
Thanks for the fix Eric.
```python
# Which is this (rms=Frobenius norm=np.linalg.norm):
newtarget = np.mean(
    [(m * p) / np.linalg.norm(m) for m, p in zip(maxmaps, polarities)], axis=0
)
```
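The comment in the diff equates RMS with the Frobenius norm; for a 1-D map they differ only by a factor of sqrt(N), which is a per-map constant and therefore drops out of any subsequent Pearson correlation. A quick check (toy data, not the PR's code):

```python
import numpy as np

rng = np.random.default_rng(0)
m = rng.standard_normal(32)  # a toy component map (hypothetical data)

rms = np.sqrt(np.mean(m ** 2))
fro = np.linalg.norm(m)  # sqrt(sum(m ** 2))

# The two differ only by the constant sqrt(N) ...
assert np.isclose(fro, rms * np.sqrt(m.size))

# ... so maps normalized either way are proportional, and Pearson
# correlations against any template are unaffected by the choice.
a, b = m / rms, m / fro
assert np.isclose(np.corrcoef(a, b)[0, 1], 1.0)
```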
Can't this be written in vectorized form that would also be more readable? Like
`(np.array(maxmaps) * np.array(polarities) / np.linalg.norm(np.array(maxmaps))).mean()` or something like that.
I think that might work if you pass the right axis (and maybe keepdims=True) to np.linalg.norm
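For the non-ragged case, the axis/keepdims variant suggested here could look roughly like this (a sketch with toy inputs, not the PR's code):

```python
import numpy as np

# Hypothetical non-ragged case: every subject contributed the same number of
# maps, so they stack into a rectangular array of shape (n_maps, n_channels).
maxmaps = [np.array([1.0, 2.0, 2.0]), np.array([0.0, 3.0, 4.0])]
polarities = [1, -1]

arr = np.asarray(maxmaps)
pols = np.asarray(polarities)[:, np.newaxis]

# Per-row norms; keepdims=True keeps shape (n_maps, 1) so division broadcasts.
norms = np.linalg.norm(arr, axis=1, keepdims=True)
newtarget = (arr * pols / norms).mean(axis=0)
```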
Maybe with the right axis manipulations... I'll try it and you can see if it's better
... actually I'm not sure this will work unless we're guaranteed non-raggedness of the np.array, which we might not be if there are different numbers of components (which I think could happen if the number of PCA vectors differs across subjects for example). But I can unify the iteration step at least to a single one, which I'll push.
yeah we cannot assume non-raggedness (IIRC I think raggedness even happens in our tutorial docs on corrmap)
In that case I think what's here is about as readable as you can get with as few temporaries in memory (just one for the current sum plus one that is being added)
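The single-accumulator iteration being described could be sketched like this (toy flat lists here; in the real code the raggedness arises upstream, and these names follow the diff rather than the PR's exact implementation):

```python
import numpy as np

# Hypothetical inputs: maps all share a channel count, but their number can
# vary per subject, so stacking everything into one rectangular array is not
# guaranteed to be possible in general.
maxmaps = [np.array([1.0, 2.0, 2.0]), np.array([0.0, 3.0, 4.0]),
           np.array([2.0, 0.0, 0.0])]
polarities = [1, -1, 1]

# One pass, one accumulator: equivalent to the list-comprehension mean, but
# only the running sum plus the map being added are alive at any moment.
newtarget = np.zeros_like(maxmaps[0])
for m, p in zip(maxmaps, polarities):
    newtarget = newtarget + (m * p) / np.linalg.norm(m)
newtarget = newtarget / len(maxmaps)
```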
Thanks @larsoner
Pretty sure from going back to the original paper that our `corrmap` code has been (EDIT: a little bit) incorrect ever since it was added in 0.10 in #2104 (which took over #1985, which took over #1795 -- the bug is present in all). We used the mean of some weirdly transformed maps instead of just the mean of RMS-normalized maps as noted in the paper. Our code in `main` did something like a z-score, but not quite, even after fixing the incorrect usage of `std` to `mean`.

Fortunately I don't think it makes a big difference -- at least in our tutorial and tests, the `newtarget` in this PR and the `newtarget` on `main` have a correlation coefficient > 0.99. But it should be better to use the correct / reference implementation I think.

Looking at the paper, one difference still remains: we use the median correlation (across ICs) rather than a mean with a Fisher z-transformation. I think that part is okay to stay as is -- especially since some of our correlations can be near `1.`, which is a problem when subjected to a Fisher z-transformation. So I think the existing use of `median` is a justified departure for us.

Closes #11858
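As an illustration of the Fisher z issue mentioned above (a quick numeric check, not project code):

```python
import numpy as np

# Fisher z-transform: z = arctanh(r). It diverges as |r| -> 1, so a mean of
# z-values is dominated by near-perfect correlations.
for r in (0.9, 0.99, 0.999999):
    print(r, "->", np.arctanh(r))

# At exactly r = 1 the transform is infinite (NumPy warns about divide by
# zero), which motivates keeping the median of the raw correlations instead.
with np.errstate(divide="ignore"):
    print(np.arctanh(1.0))  # -> inf
```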