Conversation
…issue images with additional bias field correction
* Also minor clean-up
* Add an independent max-iter for inner-loop scale normalisation
* Support single-tissue input
* Perform a coarse initial outlier rejection before main iteration
* Remove independent option
* Correctly apply scale normalisation/bias field in output of 4D images
* Include 'log_norm_scale' in header of output images
* Looking at difference between mask affected by outlier rejection
…alisation factors
* Make sure we compare prev_mask to the current mask before we cache the current mask. Genius!
…ke sure we avoid stride mismatches
* Elevate all input images to 4d via replicate adapter
…e are important fundamental reasons why the method is not producing the results it was supposed to. mtlognorm is a new method: developed, validated, implemented and debugged from scratch. I have discussed this way of proceeding, and replacing mtbin, at length with Alan; he fully supports it. The last thing we want to see happen is people producing very suboptimal FBA studies, which was a major risk with mtbin in the pipeline. Sorry if people had expected more discussion around this, but it just wasn't happening fast enough, and we're under massive time pressure over here at the moment.
includes reintroduction of dwibiascorrect in the multi-tissue FBA pipeline: I've found in several studies that this has an important impact on both mask and response function estimation (the latter applies specifically to the dwi2response tournier algorithm). I've also changed the ANTS parameters: the new ones greatly reduce the corrected intensity of non-brain bits in the lower frontal regions of the head, which again greatly aids the mask estimation so it doesn't include too much crap.
I don't see this change in the diff... Also: This is still being applied as a peak amplitude threshold... has an integral threshold been tried instead?
Consider instead having

I would however fix the basis function error in
That's probably the only one that's likely to go with it. We haven't even created a development branch yet for accumulating updates, so adding anything else would involve finishing and combining multiple pull requests; and I don't think it's worth delaying this by trying to bundle other things to go with it.
If the issues are of the severity that you're describing, I'd have preferred that an announcement be sent out before a replacement was ready. If results generated using this command had the potential to be worse than what would have been produced using what the recommended pipeline was before

Command name (comments)
I don't see that it's any different to
These demonstrate the issue above. For instance, we have

Command name (proposed)

Having gone through the command and compared it to the

If we consider those features that are common between this command and
Those features that differ between the two commands:
So while I appreciate that you want to emphasize these differences in order to justify the importance of the changes / the magnitude of your contribution / the claim of command authorship, ultimately this command has more in common with "the MTBIN method" than it has differences. Therefore, to demand absolute separation of this command from

OK, that's the nasty bit done. Now onto some constructive input. I advocate the following compromise: rename this new command to

This:
Finally, some code snippets I didn't change but wanted to leave a comment on, putting them here so that maybe they provide a little cooldown period from the above:
Provide access to this as a command-line option?
These tripped me up initially. If I'm going to sum a set of values, then take the logarithm of the result, I would intuitively call that "log_summed". "summed_log" may be more consistent with what I see people doing with file names in their pipelines, i.e. appending the most recent processing step to the name. Not sure if this is just a personal preference thing about variable naming; I just found it weird.
I wouldn't run
Worth having as command-line options?
Is this going to need to be regressed against? Or should it be incorporated into scaling the output images?
Because the lower quartile iterator was used as the starting position for the second nth_element() call, it was possible for the second invocation to change the value underlying that iterator before it was read.
…nput, intensity normalisation computations and writing output
…se a percentage target
See below for an introduction to / overview of the structure of

Users
I also hope the above makes it clear it's not easy (in practice, for me) to talk about both algorithms with respect to each other, since it's not just the order or looping structure of some boxes that is different, or even, on top of that, the log domain in certain crucial definitions of those boxes, but also in a way the meaning of what our boxes are after, versus

Finally, to add some insight into how step 2 came to be added to our design at some stage (you can track this down in the commits at some point): before that one, the chicken-and-egg problem that kept us up all night was the order of 3 and 4. Once the thing's running smoothly, that order doesn't really matter; it's then just about getting the pair 3-4 right, and iteratively optimising that pair with 5. But at the start, we face a dilemma. In a decently behaved scenario, no worries. But:
So both scenarios are rare, and it's a bit hard to guess which one is more likely when we integrate over all use cases by actual users from now until the end of time. So it still kept us up at night, and deciding the order of 3-4, while irrelevant for most scenarios, became a small obsession (on my part mostly; @rtabbara is probably more reasonable on this front 😉) to get right. Adding step 2 is the compromise between both: we do detect outliers first, but we do it in a much more relaxed fashion in that step. So the tissue types have to be unbalanced to extremely silly extents for the above horror scenario to unfold. That's all the reasoning there is to that specific one, which, as mentioned, was a final touch to the algorithm once the rest was established.

Ok, the above documents
OK, I just rebased this to merge onto the

As to this branch, I guess the only thing left to discuss is the name; feels like we agree on just about everything else.
We have done various testing using:
For most, both
Ultimate fail case. I picked one of the 40 subjects from the first bullet above for this. There were 3 such "ultimate fail" cases among those 40. Everything outside the mask is black; everything inside the mask uses the jet colour scale. As you can see, there is a little bit of crap attached to the mask at the front. This happens every once in a while, but the bit was really small in this case.
Look at the range for insight into the problem at the crosshairs. Maybe we could iron that out, one might think? Let's try and feed these outputs into

Yeah, no that is. This independently also shows

Next up,

All nice and good. We may wonder if we really converged as well as it seems. Let's throw the outputs at

It definitely seems to ultimately settle, with literally no more changes in the outlier mask from outer iteration 5 onwards. Note that in between each of those lines, both the spatially variant normalisation and subsequently the rebalancing are given a go (see the diagram in the other post), and still this doesn't cause changes in the outlier mask. The normalisation field and the rebalancing are still co-optimising, but the changes are in the realm of numerical insignificance.

Interesting observation: why does the algorithm still seem to go so happily in those first 4 iterations? Simple explanation really: the initial, weaker outlier rejection sets the initial rebalancing off to a slightly different start, compared to what it was "used to" and had settled on during the last iterations of the previous run. But it quickly finds its way again during iteration 1 itself, and the subsequent iterations are just for obsession's sake. To convince us of that, I've output the actual normalisation field of this run as well. Here are the stats on that one:

Beautiful. And just because we're overly confident, let's check that for the entire field of view, even spatially quite far beyond the actual mask:

Booyah. Note this is far from trivial, and actually quite exceptional: we all know how a polynomial behaves beyond the domain of the data it was fit to. The edge voxels, especially those in that crappy extra bit of the mask, risk having all the influence needed to still send this polynomial to very low or high values by the time it reaches the edge of the field of view. But it doesn't; it's marvellously flat.

Finally, just for fun, let's try and give

Excellent.
It even settles on close to the same balance coefficients that it did for the input data that was not first mangled by

Still a significant correction versus the original output of

That's still quite a significant relative impact in the brain area itself as well, with relative differences between core brain regions that easily exceed 10%. One of the main things I worry about with respect to
I will reply to those other comments later, as I need to catch a train before they stop running today. In the meantime, just to quickly add the gist on the name thing: I'm very much for
Change is at: https://github.com/MRtrix3/mrtrix3/pull/1036/files#diff-eb4aa7a31236ad6b638bffa6a4485951R129

Peaks vs integral: in these scenarios it would practically make no difference at all (apart from maybe a few random differences at the far extents of the WM (essentially already in the GM)), again due to the template being that clean. But now that I think of it, "peaks" is actually not that bad a choice: we're essentially after things that are quite certain to be genuine fixels. Peaks have the actual benefit of (especially in a template) being a measure that also factors in certainty, as peaks of subjects that were less certain or less well aligned across the template subjects will be lower in amplitude. I know it of course also factors in dispersion, but even that could in a way be seen as less "certainty of fixel". There are definitely arguments against this reasoning too; but so I feel peaks isn't really that bad per se for this particular step. Overall, I wouldn't be too fussed about it personally. Peaks does a pretty good job in
This has now been implemented, with a lot of warnings here and there to make any user think about it carefully.
All good.
Outputting the weighted sum image doesn't really help much for better understanding in any scenario I could think of. I don't want to confuse users with this. But users who do want to see this (and then probably understand the algorithm very well, so they can interpret the weighted sum image in context) can simply

Allowing the outlier fences to be changed is also not really needed, I reckon; unless a user at some point comes up with a compelling case for why they'd need to, I wouldn't confuse them with it. The numbers chosen here (1.5 and 3.0) are pretty standardised definitions of outliers and extreme outliers (used for boxplots in e.g. SPSS and other packages too). They're also non-parametric, so quite safe to use. It doesn't really matter how much the incoming bias fields vary spatially; eventually the algorithm converges, the outliers are only relative to the corrected image at that stage, and they end up indicating outliers that could also not be fit by the polynomial.

Finally,
I should rectify this: there has been no intent of abusing the name for IP purposes. In fact, I think it would be unwise to decide command names via IP whatevers: the command names' primary purpose should be to indicate what the command does, rather than err towards an encrypted/obfuscated acronym with version numbers encoded. As @jdtournier mentioned, there wasn't even a published acronym for the method here to begin with; and if I ever write/publish something about this one, I won't be calling it anything "BIN2". If anything, this is a great opportunity to choose a clearer name.

The reason why I was insisting on not going under the

Only annoying thing about the name change right now is that

Not wanting to add a nasty bit of my own, but just being very objective here, by the way:

Well, just my 2 cents. But for all intents and purposes, I think the command names should just be as clear and simple as possible. "normalise" still only has 9 characters, and "mt" is rather short too. There are a lot of much longer MRtrix command names out there.
Alright, that's it then, finally. I will merge into the tag branch, so don't panic please. 😅 If there's still discussion to be done, there's plenty of opportunities still.
🙄
As in: actually considering changing those names (not bringing up the former facts). I don't think these kinds of "in principle" arguments should drive name changes. For me, a command name is part of the user interface, which should be designed just for the purpose of interfacing with the user. Communicating history can be done in manuals, appendices, ... but should not get in the way of the primary goal of the software: to be used.
OK, this branch is already merged with
👍











I would personally encourage to:

* do away with `mtbin`, as there are important issues with it;
* update the docs accordingly, such as in the way offered by the work we did for this pull request;
* reintroduce `dwibiascorrect` in the pipeline anyway, because I observed an important impact on `dwi2mask` and `dwi2response tournier` in realistic cases;
* adopt the lower threshold for the template fixel masks that I have suggested as a docs change in this pull request, since 0.2 has been found to be consistently inappropriate in a lot of studies now, and this impact is very important for CFE, which affects other fixels even if they are in the mask at the higher threshold.

The reason that I retained `mtbin` before, but made it provide the warning/error, was to in the meantime alarm users that they should probably switch.