Conversation
I think you're referencing the wrong test. The test that's failing is on line 4. Indeed, appending
Line 4, not the fourth non-commented test; got it 😅
force-pushed from 0da8161 to a57c888
force-pushed from a57c888 to 3affd5a
Multiple occurrences of "-Wunused-but-set-variable" that appeared on Mac OSX GitHub CI.
force-pushed from 3affd5a to 6b9ed64
Two tests remain failing even with fairly relaxed tolerances:
Would be thankful if someone could generate these on an ARM64 machine and forward the data so that I can compare them to what mine generates; I'd prefer to see the differences on these rather than naively dialing back the tolerances further.
Ok, so I have spent a frustratingly long time trying to understand why these precision issues arise in the first place. After playing around with the code in
Other projects also seem to be affected by this problem, e.g. see here. @MRtrix3/mrtrix3-devs should we just turn off FMA operations for ARM64 Mac chips?
- For `mrcolour`, output images are clamped to the [0.0, 1.0] range, so use absolute tolerance rather than fractional.
- Increase tolerance on `tckresample -step_size` given the imprecision of the resampling operation itself.
Thanks @bjeurissen; very helpful.
Ooofffffff, race condition 🤪

Personally I'd say keep this PR. The tests should ideally pass on any system where the code has been compiled, which means both different hardware and different compiler flags, so some amount of tolerance to account for such is reasonable. We're not dealing with differences of a magnitude that is consequential for the respective operations to be performed on user data. If it were a bigger problem we could discuss generating test data with FFP on, and either using that as reference with tolerance for the errors with FFP off, or storing outputs with both configurations and comparing generated data to both references; but I don't think it's worth the trouble here.

What are the respective timelines on compiler support for FFP contractions vs. hardware support for such opcodes? If it's both potentially faster and more precise, there's an argument to be made to, rather than disabling such to have less inter-system variance, instead enable such across the board, including for precompiled binaries. But only if they're long-standing x86 operations that compilers are only now starting to utilise.
If we believe the new tolerances to be more sensible, I agree. My only concern here is that the differences in results between ARM64 and x86 chips could arbitrarily propagate to large values (e.g. in a for loop that involves floating-point operations). If this is the case, then our commands can give significantly different results depending on the platform. I guess we can deal with it when we encounter an issue of this nature.
Any modern x86 CPU (e.g. any CPU released in the last 10 years by Intel or AMD) should support FMA operations (C++ even provides implementations in the standard library). Both GCC and Clang now enable these optimisations by default.
Addresses #2885. Working off the linked CI errors there.
Targeting `master` here; will need to either merge from `master` back to `dev` or re-implement there.
- `dwidenoise` should hopefully only require a minor increase to tolerance.
- `mrcolour` tests had no tolerance whatsoever, so it would only take a minute difference in floating-point rounding to set them off.
- `tckresample` I was a little surprised by, given that there's a default tolerance of 1e-5 mm. It is however possible that, because of the operation of the two problematic resampling strategies (fixed step size and fixed number of points), floating-point imprecision could accumulate. Will have to wait and see if a 10x increase in tolerance is sufficient.
- `vectorstats` failure: the corresponding test very clearly shows use of `-frac 1e-6` for that call, but the error message claims an absolute tolerance of zero. Will see what gets generated from this PR.