Fix SH precomputer and update associated thresholds so existing practices and pipelines don't break.#1228
Fix SH precomputer and update associated thresholds so existing practices and pipelines don't break.#1228jdtournier merged 29 commits intodevfrom
Conversation
…h of the pull request to dev, so it seems.
|
The insignificant commit is to be able to set the base branch to |
|
Finally, note that automatically updating this with |
|
Ok, we're set up again, ready to tackle the thresholds. |
|
At the moment, the thresholds are untouched (0.1). Conclusion being: I suppose only the This is great news, and a big relief for all FBA pipeline users... the only change that is needed here ends up being the one unrelated to the SH precomputer fix: just a lower number in the documentation of the FBA pipeline, in line with our experiences here across a broad range of FBA studies. But the behaviour before and after the fix for the FBA pipeline remains the same! So as to I'll patch up the FBA documentation with the unrelated change to the threshold advice, the way we've been doing it here for all our studies since several years now. Only remaining thing on the TODO list (if the TODO is complete of course) is then |
|
Ok, minimal adjustment of the docs done in line with experience. I've been wanting to unify both SS and MS FBA pipelines into a single one, again in line with experience. We've got a few better practices for both pipelines around here that can unify both. But I'll leave that to a separate feature branch, as that's indeed independent from this. Again, what I think only remains now here is to check |
|
A quick test reveals we're not as lucky with |
This means that if the maximal amplitude value sampled for a lobe from the dixel set is just below the threshold, but Newton revision on the sphere yields an amplitude that is just above the thresold, the fixel survives the threshold test. Also removed multiplie redundant calls to FOD_lobe::finalise(). Identified from discussion in #1228.
Consistent with the code as it stands: Peak amplitude test occurs before Newton revision of peak, but only the latter uses the precomputer. Have modified so that the peak amplitude test occurs after Newton revision; unlikely to make much of a difference though (without the precomputer bug that is; with, it would make a difference).
No, |
👍 Makes full sense to me now. About the latter change:
I'm not up to date with all the details in the FMLS code, but just checking to be sure: are we certain this doesn't introduce problems on other fronts, such as the
👍 Yep, all clear to me now. |
The FOD lbes are segmented and the integral computed based on the amplitudes in the 1281-direction dixel set. So it's impossible for a fixel to not have any dixels within it.
The Newton optimisation has always occurred, regardless of whether
While I don't know exactly how much time this contributes, I doubt it would be the dominant factor compared to the SH2A transformation / level set evolution. Unlike
It makes sense to me to apply the peak amplitude threshold to the peak amplitude as it is most accurately estimated, not the "nearest guess" that the FMLS obtains. It would also be a bit of code management as to whether or not that peak revision takes place, given it may be triggered not only by use of the
Yeah I suppose that'll happen if the application of the threshold changes. But given this:
, I'd suggest the test data should be re-generated.
I think I'd advocate leaving that the way it is. No point in putting effort into de-sensitising tests to changes if the high sensitivity is not causing a problem.
Is this purely for hard threshold CSD? Would we still advocate 0.1 for soft-threshold? |
Ah, ok. I had failed to appreciate that parts of the lobe below the threshold are also still part of the fixel. In that case, no worries indeed.
Ok, so compared to before, definitely no additional computational load. That's all good then. It'd indeed not be worth the effort and extra complication to make it so, especially not if the computational load of the Newton optimisation isn't expected to be significant. Happy to keep things as is then.
Oh yeah, of course. But they were already re-generated by @jdtournier when the fix was implemented (hence the tests succeeding until right before the most recent commit). But they'll have to be re-generated again after this commit, due to such a "dimension change": essentially a few fixels that now succeed to make the threshold, but didn't before.
Fully agreed; I wasn't aiming to change this actually, it was just an observation that I hadn't realised before.
It's actually a compromise, but a slight shift in favour of multi-tissue / hard thresholded CSD nonetheless. I once proposed a change from 0.1 to 0.05 before, which would've been entirely in favour of the "new" CSDs. However, with the current (almost) overall amplitude changes, there is compensation towards the other end of the spectrum. The coincidentally nice thing is that this allows for a compromise that is still a nice "round" number: 0.05 instead of something like 0.07 or 0.075 or similar. |
|
Here's an example. Nice multi-shell data and multi-tissue CSD. Tractography seeded from the brain mask and everything else defaults (except An OK result, until you look at the details and notice it struggles to reach the cortex in some locations. This could already mean (some) trouble for connectomics (and ACT), and surgical planning if inclusion ROIs are rather strictly confined to the cortex. Increased risk in presence of tumours or lesions, where this could easily lead to connections disappearing, even though they were merely infiltrated by the pathology. Hence my arguments in the past to lower the default already, but I failed to gather enough support for it. Of course, now that the bug fix results in a 35~40% amplitude change (unless the orientation is very, very closely aligned with the z-axis) that is tested against the threshold, one can imagine what's going to happen... Here it is: This is really not acceptable. Entire parts are untracked, and streamlines fail to reach the WM-GM boundary in several significant chunks of the cortex. So something needs to change. My proposed new default threshold of 0.05 would result in this: This overcomes the overall changes brought about by the bug fix, and pushes a little bit further to be somewhat more inclusive. In many, if not most, scenarios I'd still push quite a bit further, but as a default, this isn't so bad. Here's what you'd get with the current bug fix and my proposed 0.05 default threshold if tracking on FODs from soft-constrained SSST-CSD: (Note that the background is still the multi-tissue result here, so it's easier to directly compare the difference in the streamlines tractograms.) Would I still advise 0.1 here instead? Well, actually not. To compensate for the bug fix, a threshold of ~0.065 or something would be reasonable. But even then, there's actually no harm at all in using 0.05 for all typical scenarios. If doing connectomics, you've got ACT and parcel assignment anyway to cut short streamlines. If segmenting bundles, e.g. surgical planning or ROI based analysis, you'll typically have 2 inclusion regions on both ends of the bundle at least, which can be augmented with a mask or exclusion regions to deal with a few spurious streamlines. Or even easier, you'd compute a TDI of the bundle defined by only inclusion regions, and threshold the TDI directly (which is what we're doing around here now for studies that need anatomical WM ROIs). Trying to go just of a threshold, you always loose with soft-constrained SSST-CSD anyway, as when you raise the threshold, you start loosing connections through certain crossings. The 3-way crossings first, which sit at important intersections of major bundles. So really, unless your application is making a desktop background or printed poster of a whole brain tractogram for purely artistic purposes, and you insist on not-using ACT, and you insist on using soft-constrained CSD, and you refuse to optionally look at your threshold yourself a bit for the sake of an artistic cleaned-up output that is a non-critical scenario anyway... well, there's nothing reasonable against an 0.05 threshold at this stage (taking into account the bug fix), I'd say. For any other scenario, one can already resort to at least 2-tissue CSD anyway, which brings in the hard constraint naturally. Finally, in the spectrum of all scenarios, even soft-constrained SSST-CSD ones, it's almost always still better or safer to be inclusive, rather than risking false negatives. It's the case for connectomics, and segmentations, and in particular for surgical planning. |
|
Apart from having to change the default http://community.mrtrix.org/t/cutoff-value-in-tckgen/1360 ...I was reminded about this: (quoting myself from some posts above)
We'll have to be careful documenting the different defaults. We could even consider creating separate Not per se necessary, but worthwhile considering. I don't feel strongly about it. But I do feel strongly about the actual default value for the FOD cutoff: that really needs to be fixed along with this pull request. |
|
@Lestropie : something seems to be wrong with commit 849b475. Example using a little test FOD image (small region of a template from a collaborator I was QA'ing yesterday; but the problem happened around the whole brain): For the test case below, I generated fixels using the code just before that commit (i.e. checking out e3c6850):
...resulting in: This is all good and makes sense. I chose a high threshold of 0.8 for this example, so you can easily visually get an idea of the order of magnitude of the threshold (from the selected fixels in the image, versus the ones that didn't make the cut). Running the same line of code (i.e. Note that the newly appearing fixels don't make sense, given the order of magnitude of the threshold... Zooming into the middle voxel to see what's going on: ...and from another angle: |
|
OK, clearly I'm being too lenient on the Newton optimiser... it's jumping from seeds on those small peaks onto the adjacent large lobe, and hence those tiny ones are surviving the peak amplitude threshold. The revised peak will need to be tested to ensure that the dixel closest to the newly-selected peak is a member of that particular lobe. Bit of fiddling, but shouldn't be too difficult. |
When the initial FOD peak identified via the FMLS process is then revised via Newton optimisation, apply additional constraints to decide whether or not the outcome of Newton optimisation should be trusted: - The dixel to which the new peak direction is closest must be a part of the same FOD lobe. - The new peak direction must be closer to the initial peak direction (that was used as the seed point for the Newton optimisation) than to any other FOD peaks within the same lobe. Addresses issue introduced in commit 849b475 as reported in #1228.
|
OK, I have a working script that:
So, this will give the ultimate suite of data with which to choose what we want to do taking into account a range of use cases. Hopefully tomorrow I'll post one mighty big image 😈 It would also be worth having some side-by-side comparison of tracking with & without the bug fix, which might show not only the difference in effective threshold but also the Z-axis bias. |
|
Matrix of tracking outcomes: Full-resolution image here. Don't know if this provides the specific information we need; if there are any alternative ideas (e.g. generating a similar image matrix for targeted tracking), those can be investigated also. |
|
Also because I tried to fit the thalamus into the FoV / didn't want to zoom out too far / didn't show the DWI-based brain mask, can't see whether or not streamlines are making it to the edge of the mask when ACT isn't used. I'll keep playing around, but ideas more than welcome. |
|
Also: While we can't trivially alter the default threshold based on FOD reconstruction algorithm, we can trivially alter it based on the presence or absence of ACT (if we choose to). |
|
Second attempt: 0.01 increments, used GM seeding, and more tracks. Full-resolution here. |
The default threshold for tracking is now dependent on the particular algorithm selected.
|
Even though |
|
Yep, that's all (still) deliberate. Those altered recommendations (compared to the default value) take into account the fact that it runs on a template at that stage in the pipeline. For single-tissue CSD at that stage, a slightly higher threshold is safer to not end up with lots of false positive fixels in the GM (tractography, on the other hand, isn't that much affected by all of those, since many don't align with the nearby WM). For multi-tissue CSD, we've always tuned these manually, and I've assisted different people manually with their template fixel mask at this stage. 0.06 came out as a safe threshold, that also lands the number of fixels in an acceptable/realistic range. This has/had to be lower than the subject thresholds also for the reason that the template typically has lower FOD peak amplitudes. For subject segmentations, again we want to be a little bit more careful compared to tractography, since all those fixels (may) end up in the analysis; at the same time, we can comfortable be safer than tractography, since in FBA we mostly need the more robust core WM. Due to variance of cortical topology, we can't find significant results there anyway most of the time; so it's not that problematic that no fixels at the subject level are extracted all the way up until (in) the cortex. In general, in tractography we've also got other constraints (such as curvature) that can inherently deal with a noisy peak here and there, as long as it's not consistent with surrounding structure. For most applications where we segment fixels individually per voxel, we may want to be slightly more aggressive on cleaning up noisy peaks. Long story short: those template recommendations are definitely optimised now, based on a wide range of templates in different studies... I definitely wouldn't touch those. Furthermore the updated docs for both FBA pipelines have a big warning box at that step, which explains further (to great extent) how to deal with those values. The respective values are very good starting points, that seem to generalise very well, but the advise there is again to go hands-on for the best result. As to the default value used within |
|
Random thoughts:
This is probably too sensitive to both the FOD estimation algorithm and the context in which |
|
Even taking all those bullet points into account, looking at those results, I'd definitely still happily stick to 0.10 there. Even for 3-tissue CSD, I already see a (few) false positives using 0.1, and more for 0.05. For 2-tissue and single-tissue (soft) CSD, things really become too crazy, I'd argue. Side note:
You'd be surprised how easily that happens, especially in voxels with genuine 2- or 3-way crossings. The purpose of Happy to keep things the way they are for |
|
@MRtrix3/mrtrix3-devs , any further opinions? If not, I reckon we're done here. |
|
This is entirely ready to hit the button. I'm not at work now (ANZAC day down under), but unless anyone comments in the next 20 hours, I'll consider this good to go. If anyone's not happy/comfortable with that, speak up. |











Continuing work from #1206, which wasn't completed yet.
Relevant:
2 dot points from this post: #1206 (comment)
This list of dot points: #1206 (comment)
For convenience, here's the latter:
Number (1) is done, the rest is not. We can't deploy this fix without addressing these others, or behaviour will be changed significantly and in an unwanted way, breaking people's pipelines' intention. I will comment further on the thresholds next week.