Manually ported from fixel_twi_0.3.15 branch.
Previously, all images in the fixel directory were opened, and then is_directions_file() was called, which checked both the image properties and the file name. Apart from being inefficient, this also clogs -info output as the directory is scanned. This change validates that the name of a file is a plausible directions file before instantiating a Header and checking the image properties.
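A minimal sketch of the "check the name first, open the header second" ordering described above. This is Python for illustration rather than the actual C++ change; the base name, the extension list, and the two callables are assumptions made up for the example, not taken from the MRtrix3 source.

```python
from pathlib import Path

# Assumptions for illustration only: this base name and extension list are
# hypothetical and are not copied from the MRtrix3 source.
DIRECTIONS_STEM = "directions"
IMAGE_SUFFIXES = (".mif", ".mif.gz", ".nii", ".nii.gz")


def is_plausible_directions_name(path):
    """Cheap check on the file name alone; no file is opened."""
    name = Path(path).name
    return any(name == DIRECTIONS_STEM + suffix for suffix in IMAGE_SUFFIXES)


def find_directions_file(filenames, open_header, has_directions_properties):
    """Return the first file passing both the name check and the header check.

    open_header and has_directions_properties are caller-supplied stand-ins
    for the expensive header load and the image property check.
    """
    for filename in filenames:
        if not is_plausible_directions_name(filename):
            continue  # skipped without ever opening the image
        header = open_header(filename)
        if has_directions_properties(header):
            return filename
    return None
```

With this ordering, a directory containing index.mif, fd.mif and directions.mif results in only one header being opened.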
Functions for T-tests were moved into individual GLM classes, to reduce confusion with respect to pre-scaling of contrasts and transposition of matrices.
…e design matrix experiments
- Incorrect definition of non-const references to global matrices during calculation of default statistics, resulting in zero stdev and std_effect_size.
- Incorrect calculation of degrees of freedom in GLMTTestVariable::ttest(), resulting in infinite variance and zero t-values.
NOTE: Not yet tested.
Detects the presence of non-finite values in either the input data, or element-wise design matrix columns, and omits the relevant rows from the linear regression. Currently only implemented for fixelcfestats.
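The row omission can be sketched as follows. This is a plain-Python illustration under the assumption that the data for one element is a list with one value per subject and the design matrix is a list of rows; the function names are hypothetical and this is not the fixelcfestats code.

```python
import math


def finite_rows(data, design):
    """Indices of rows where the datum and every design entry are finite."""
    keep = []
    for i, (y, row) in enumerate(zip(data, design)):
        if math.isfinite(y) and all(math.isfinite(x) for x in row):
            keep.append(i)
    return keep


def scrub(data, design):
    """Drop every row that contains a NaN or infinity before fitting."""
    keep = finite_rows(data, design)
    return [data[i] for i in keep], [design[i] for i in keep]


# Subject 2 has a NaN datum, subject 3 has an infinite design entry
data = [1.0, math.nan, 3.0, 4.0]
design = [[1.0, 0.5], [1.0, 1.0], [1.0, math.inf], [1.0, 2.0]]
y, X = scrub(data, design)
```

Any regression fitted afterwards sees only the remaining rows, so the number of observations used for that element shrinks accordingly.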
Provides functionality for edge-wise explanatory variables, implemented in the same manner as was done for fixelcfestats.
Performing substantial re-write of connected components filter in order to make roles and functionalities of different code more clear.
The roles of different parts of the code have been made more clear by encapsulating them within appropriately-named classes. The functionality of mapping each voxel position to an index in a 1D vector of data now resides in the Voxel2Vector class, placed in core/misc since it is used both by the connected components filter, and the statistics code. core/bitset.h has also been moved into this new core/misc/ directory, since it's just a utility class that isn't as fundamental to the compilation of MRtrix3 as the other contents of core/.
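To illustrate the kind of mapping that Voxel2Vector encapsulates, here is a small Python sketch. The class name, the nested-list mask layout, and the method names are all hypothetical; the real class is C++ and its interface may look quite different.

```python
class Voxel2VectorSketch:
    """Hypothetical stand-in: maps in-mask voxel positions to 1D indices."""

    def __init__(self, mask):
        # mask is a nested list indexed as mask[x][y][z]; truthy means in-mask
        self._forward = {}
        self._reverse = []
        for x, plane in enumerate(mask):
            for y, row in enumerate(plane):
                for z, inside in enumerate(row):
                    if inside:
                        self._forward[(x, y, z)] = len(self._reverse)
                        self._reverse.append((x, y, z))

    def __len__(self):
        return len(self._reverse)

    def index(self, voxel):
        """Vector index of a voxel, or None if the voxel is outside the mask."""
        return self._forward.get(tuple(voxel))

    def voxel(self, index):
        """Voxel position corresponding to a vector index."""
        return self._reverse[index]
```

Both the connected components filter and the statistics code can then work on a flat vector of length `len(v2v)` and only translate back to voxel positions when writing output.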
Functionality for voxel-wise explanatory variables, just as has already been implemented for fixelcfestats and connectomestats.
Note: None of this code has been tested whatsoever.
… into stats_elementwise_design_matrix
Some tricky merging happening here, since adding multiple contrast matrix row support conflicted quite heavily with other changes that had happened in this branch, e.g. support of NaN values in the data.
- Provide a text string describing the fact that a column of ones is not automatically added to GLM design matrices, and add this string to the DESCRIPTION field of all statistical inference commands. Also fix up some code comments regarding the purpose of this column.
- Provide a test for vectorstats; its output is not yet tested however.
- Modify how multiple contrasts are handled. Initial support for this was previously implemented for vectorstats only, based on the contrast being a matrix rather than a vector. However, this framework does not naturally extend to F-tests. This new approach stores each contrast within a GLM::Contrast class, and these are explicitly looped over whenever required. GLM t-test code has been replaced with the F-test (i.e. without variance groups) as presented in Winkler et al., 2014. This has removed many optimisations, but was necessary in order to sufficiently generalise the framework for upcoming enhancement. Changed stdev calculations in statistical inference commands to a vector, since this does not change between contrasts.
- Removed apparently redundant creation of output images related to the default permutation in mrclusterstats.
- Model is currently partitioned based on null columns of the contrast matrix only.
Note that this is merely the first COMPILING version of this code.
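For reference, the F-statistic without variance groups from Winkler et al., 2014 can be written as below. The notation is as I recall it from the paper, not taken from this code: M is the N-by-r design matrix, C the contrast matrix, \hat{\psi} the estimated regression coefficients, and \hat{\epsilon} the residuals of the full model.

$$
F = \frac{\hat{\psi}' C \left( C' (M'M)^{-1} C \right)^{-1} C' \hat{\psi} \,/\, \operatorname{rank}(C)}{\hat{\epsilon}' \hat{\epsilon} \,/\, \left( N - \operatorname{rank}(M) \right)}
$$

When rank(C) = 1, the t-statistic is recoverable as the signed square root of F, which is why a single code path can serve both tests.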
Conflicts:
cmd/connectomestats.cpp
docs/reference/commands/connectomestats.rst
testing/data
Fixes operation of vectorstats test number 3, where the model is "partitioned" despite possessing no nuisance regressors.
Rather than randomly generating test data for vectorstats input, use pre-generated test data known to pass tests for statistical significance. Additionally perform precise tests of non-stochastic outputs against pre-generated output data.

🤞

Hi @Lestropie
I checked the text file list_of_subjects.txt and it only contains the identifiers with the .mif extension, so the error is not caused by the content of that file. At this point I am not sure whether this is the expected behavior and the list file is supposed to be inside the input fixel directory, or whether this is some kind of bug.

I believe the problem is that the directory location of the subject list text file is being used as the reference location, rather than the current working directory. The bit of the error message that says " Moving I probably want to remove this line of code entirely.

Hi again! This happens when I run the fixelcfestats command with a conservative mask containing 1896 fixels with the -notest flag. At first I thought it was a memory issue, but it happens in exactly the same way on a machine with 256 GB of available memory. When I ran the command with gdb: and nothing else happens, so I tried it with valgrind: But honestly, I have no idea whether this error is due to insufficient memory or something related to the Eigen library.
Did you type " One possibility is that the number of fixels is not equivalent between the various input fixel data files. Regardless of which fixels are or are not included in the mask, the sizes of the fixel data files still need to be equivalent to both the number of fixels in the index image and the fixel mask image. If this is the case I can add a check so that the error message is more informative.
No, I didn't 🤦♀️
I checked the number of fixels and it is the same for the subjects' input data and for the fixel mask, I mean, when I run
As part of #1693, fixelcfestats was modified to construct the subject data matrix based on all fixels, and then restrict processing to only those fixels inside the mask by filling subject data outside of the mask with NaNs and propagating the mask information to various functions. Previously, an index remapping was performed so that the subject data matrix would contain as many columns as there were fixels in the mask, and so input fixel indices would need to be projected to internal fixel indices. It appears as though during this change the allocation of the subject fixel data matrix was not properly updated to reflect its requisite larger size when the -mask option is used. Reported in #1543.
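The difference between the two layouts can be sketched as follows. This is an illustrative Python version of the NaN-filling approach, with a hypothetical function name and plain lists standing in for the Eigen matrix; the point is that the matrix must be allocated with one column per fixel in the index image, not one per fixel in the mask.

```python
import math


def build_subject_matrix(subject_data, mask):
    """One row per subject, one column per fixel in the index image.

    Columns outside the mask are filled with NaN rather than dropped, so no
    remapping between input and internal fixel indices is needed.
    """
    num_fixels = len(mask)
    matrix = []
    for values in subject_data:
        if len(values) != num_fixels:
            raise ValueError(f"expected {num_fixels} fixels, got {len(values)}")
        matrix.append(
            [v if inside else math.nan for v, inside in zip(values, mask)]
        )
    return matrix
```

The size check also covers the case raised earlier in this thread, where an input fixel data file does not match the number of fixels in the index image.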

@diagiraldo: Please see #2022.

Since there's plenty of demand for it, I'll see how I go getting this incorporated into RC4.
Currently there are new unit tests for vectorstats, which primarily test the internals of the GLM. Generating CI tests for connectomestats / fixelcfestats / mrclusterstats is feasible; I just haven't put the energy into them, and unlike the GLM internals the relevant code hasn't changed a great deal from the old code anyway.
Major things that are in here:
GLM:
Freedman-Lane method rather than Shuffle-Y / Manly. (Edit: While in the code it was the design matrix that was shuffled, the effect of interest vs. nuisance regressors were not separated, and hence the empirical behaviour was equivalent to instead shuffling the data, which makes it Shuffle-Y.)
Multiple hypothesis testing (i.e. contrast matrix as opposed to vector).
F-tests.
Per-element design matrices; can have:
Nuisance regressors, or indeed effects of interest, where the value is unique for each fixel / voxel / connectome edge tested;
Non-finite subject data, in which case the relevant design matrix rows are scrubbed prior to inverting the model.
Sign-flipping in addition to / instead of permutations; enables one-sample t-tests.
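As an illustration of why sign-flipping enables one-sample t-tests, here is a small exhaustive version in Python. It is a sketch only: the function names are hypothetical, it enumerates every sign combination (practical only for a handful of subjects), and it makes no claim about how the shuffles are generated in the actual commands.

```python
import itertools
import math


def one_sample_t(values):
    """t-statistic for the null hypothesis that the mean is zero."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    if var == 0.0:
        # Degenerate case: all values identical after flipping
        return math.copysign(math.inf, mean) if mean != 0.0 else 0.0
    return mean / math.sqrt(var / n)


def sign_flip_p_value(values):
    """One-sided p-value: fraction of sign-flipped t-values >= the observed one."""
    observed = one_sample_t(values)
    count = 0
    total = 0
    for signs in itertools.product((1, -1), repeat=len(values)):
        flipped = [s * v for s, v in zip(signs, values)]
        total += 1
        if one_sample_t(flipped) >= observed:
            count += 1
    return count / total
```

Permuting subject labels leaves a one-sample statistic unchanged, whereas flipping signs does not; for the input [1.0, 2.0, 3.0, 4.0] only the unflipped arrangement reaches the observed t-value, giving a p-value of 1/16.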
fixelcfestats:
Normalised form of CFE equation. Provides an effect comparable to non-stationarity correction without computational penalty. Note however that the recommended FBA processing may need to change here: both this mechanism and permutation-based non-stationarity correction are sensitive to masking.
Fixel-fixel connectivity matrix generation ~ 4-5 times faster and requires ~ 4-5 times less RAM.
Multi-threaded other stages of processing.
New commands:
fixelconnectivity: Generate a fixel-fixel connectivity matrix (and optionally a second matrix intended for data smoothing), and write to file. The current file format does however incur a decent performance hit on load/save.
fixelfilter: Perform filtering operations on fixel data. This includes smoothing, and a preliminary connected-component algorithm. Can operate on individual fixel data files, or all data files within a fixel format directory.
Use of the two commands above can be followed by running fixelcfestats but providing pre-smoothed fixel data and a pre-calculated fixel-fixel connectivity matrix. If fixelcfestats is provided with a tractogram file as per current usage, it will generate the connectivity matrices and perform fixel data smoothing during data import as per current behaviour.