FIX - FMRIB's ICA-based Xnoiseifier (R/MATLAB version)
This page describes the old R/MATLAB version of FIX. This version is no longer supported - for details on the new Python version which is installed as part of FSL, head to this page.
The R/MATLAB version is still available however - installation and usage instructions can be found below.
Referencing
If you use any version of FIX in your research, please cite these papers:
G. Salimi-Khorshidi, G. Douaud, C.F. Beckmann, M.F. Glasser, L. Griffanti, and S.M. Smith. Automatic denoising of functional MRI data: Combining independent component analysis and hierarchical fusion of classifiers. NeuroImage, 90:449-68, 2014
L. Griffanti, G. Salimi-Khorshidi, C.F. Beckmann, E.J. Auerbach, G. Douaud, C.E. Sexton, E. Zsoldos, K. Ebmeier, N. Filippini, C.E. Mackay, S. Moeller, J.G. Xu, E. Yacoub, G. Baselli, K. Ugurbil, K.L. Miller, and S.M. Smith. ICA-based artefact removal and accelerated fMRI acquisition for improved resting state network imaging. NeuroImage, 95:232-47, 2014
Downloading and Installing
The latest version (1.06) can be run without MATLAB, using either the supplied precompiled-matlab binaries, or with Octave. The other change from v1.05 is a change in the top-level meta-classifier, which gives a tiny average improvement in classification accuracy. There is no need to rerun feature generation from v1.05 for use in v1.06, but the old trained-weights files cannot be used with v1.06 (and any custom trained-weights files will need regenerating).
Requirements:
- FSL
- MATLAB (if not using the pre-compiled binaries), with official toolboxes:
  - Statistics
  - Signal Processing
- R (>=3.3.0), with the following packages:
  - kernlab 0.9.24
  - ROCR 1.0.7
  - class 7.3.14
  - party 1.0.25
  - e1071 1.6.7
  - randomForest 4.6.12
Setup FIX:
- Unpack FIX (1.06.15) with tar xvfz fix.tar.gz (or tar xvf fix.tar if your browser has already uncompressed the file).
- Download and install the MATLAB Compiled Runtime (MCR) for your operating system (mac, linux).
- See the README file for further setup instructions.
Running FIX
Simple usage, assuming training data already exists:
To run FIX, use the script fix in the FIX directory, e.g.:
/usr/local/fix/fix <mel.ica> /usr/local/fix/training_files/Standard.RData 20
You need to feed in a full "first-level" (single-session) output directory (<mel.ica>) created by the MELODIC or FEAT GUIs, with full registration run, including using a structural image. If using FEAT, you need to have had ICA turned on in the Prestats tab. For the single-subject ICA you should in general use MELODIC's automatic dimensionality estimation (which creates the sub-folder <mel.ica>/filtered_func_data.ica).
The 20 refers to the thresholding of good vs bad components; sensible values are generally in the range of 5-20. However, if it is very important to you that almost no good components are removed, and hence you would prefer to leave in the data a larger number of bad components, then use a low threshold (e.g., in the range 1-5). It is strongly recommended that you look at the ICA components yourself to check at least a few of your subjects' classifications. Look in the file called something like fix4melview_Standard_thr20.txt; the final line lists the components that are considered as noise to be removed (with counting starting at 1, not 0). When running fix as shown above, you will end up with a cleaned version of the 4D preprocessed FMRI data: filtered_func_data_clean.nii.gz.
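As a minimal sketch of checking a classification from a script, the snippet below reads the final line of a fix4melview-style text file and returns the listed noise components. It assumes only what this page states about the file (a bracketed, comma-separated list on the final line, counting from 1); the function name read_noise_components is illustrative and is not part of FIX.

```python
from pathlib import Path


def read_noise_components(path):
    """Return the 1-based noise component numbers from the final non-empty line."""
    lines = [ln.strip() for ln in Path(path).read_text().splitlines() if ln.strip()]
    # The final line is expected to look like: [1, 4, 99]
    if not lines or not (lines[-1].startswith("[") and lines[-1].endswith("]")):
        raise ValueError(f"no bracketed component list at the end of {path}")
    body = lines[-1][1:-1].strip()
    # An empty list "[]" means no components were flagged as noise.
    return [int(tok) for tok in body.split(",")] if body else []
```

For example, read_noise_components("<mel.ica>/fix4melview_Standard_thr20.txt") would give the list of components to compare against your own visual inspection.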
If you have a compute cluster you can send the whole command to the cluster by preceding it with something like fsl_sub -q long.q ... (where ... is the fix command). However, if you are using fix to train the classifier and run leave-one-out testing (see below), we recommend that you run fix locally, provided your local computer is able to submit jobs to your cluster, as it will do this for you, parallelising the LOO and greatly speeding it up.
Usage for each stage separately:
The command described above is equivalent to running the following three steps:
- Extract features (for later training and/or classifying):
/usr/local/fix/fix -f <mel.ica>
- Classify ICA components using a specific training dataset (<thresh> is in the range 0-100, typically 5-20):
/usr/local/fix/fix -c <mel.ica> <Training.RData> <thresh>
- Apply cleanup, using artefacts listed in the .txt file, to the data inside the enclosing Feat/Melodic directory:
/usr/local/fix/fix -a <mel.ica/fix4melview_TRAIN_thr.txt> [-m [-h <highpass>]] [-A] [-x <confound>] [-x <confound2>]
This text file can be the output from the step above or can be created manually, in case you want to manually remove the artefactual components. In the second case make sure that the .txt file contains a single line (or, at least, has as its final line) a list of the bad components only, with the format (for example): [1, 4, 99, ... 140] - note that the square brackets and the commas are required. Also, make sure there is an empty line at the end (i.e. hit return after writing the list). Counting starts at 1, not 0.
Options for the cleanup step:
- -m : optionally also clean up motion confounds (24 regressors), with highpass filtering of motion confounds controlled by:
  - if -h is omitted, fix will look to see if a design.fsf file is present, to find the highpass cutoff.
  - if -h is omitted, and no design.fsf is present, no filtering of the motion confounds will take place.
  - if -h <highpass> is set, then:
    - -h -1 : apply no filtering to motion confounds.
    - -h 0 : apply linear detrending only.
    - -h <highpass> with a positive <highpass> value: apply highpass with <highpass> being full-width (2*sigma) in seconds.
- -A : apply aggressive (full variance) cleanup, instead of the default less-aggressive (unique variance) cleanup.
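The cleanup options above can be assembled from a script. The following is a sketch only: it builds an argument list for the -a step using just the flags described on this page (-m, -h, -A), and it assumes the fix script is reachable under the name passed as fix_cmd. The function name build_cleanup_args and its parameters are illustrative, not part of FIX.

```python
def build_cleanup_args(labels_file, motion=False, highpass=None,
                       aggressive=False, fix_cmd="fix"):
    """Assemble the argument list for the 'fix -a' cleanup step."""
    if highpass is not None and not motion:
        # The usage line nests -h inside -m: [-m [-h <highpass>]]
        raise ValueError("-h only applies together with -m")
    args = [fix_cmd, "-a", str(labels_file)]
    if motion:
        args.append("-m")
        if highpass is not None:
            # -1: no filtering, 0: linear detrending only,
            # >0: highpass full width (2*sigma) in seconds
            args += ["-h", str(highpass)]
    if aggressive:
        args.append("-A")
    return args
```

The resulting list can be passed to subprocess.run(args, check=True). Leaving highpass as None with motion=True reproduces the default behaviour described above, where fix looks for design.fsf itself.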
Training datasets
Trained-weights files
FIX needs to be trained from multiple datasets that have already had the ICA components classified into "good" and "bad" by hand. We have hand-trained a few different types of data, and the trained-weights files from these are supplied with FIX. If you want to train FIX yourself (which in general is recommended), to better optimise it for the kind of data you have, you will need to do this hand classification yourself (at least 10 subjects). Alternatively, you can use one of the trained-weights *.RData files supplied with FIX.
There are currently several trained-weights files supplied:
- Standard.RData - for use on more "standard" FMRI datasets / analyses; e.g., TR=3s, Resolution=3.5x3.5x3.5mm, Session=6mins, default FEAT preprocessing (including default spatial smoothing).
- HCP_hp2000.RData - for use on "minimally-preprocessed" 3T HCP-like datasets, e.g., TR=0.7s, Resolution=2x2x2mm, Session=15mins, no spatial smoothing, minimal (2000s FWHM) highpass temporal filtering.
- HCP7T_hp2000.RData - for use on "minimally-preprocessed" 7T HCP-like datasets, e.g., TR=1.0s, Resolution=1.6x1.6x1.6mm, Session=15mins, no spatial smoothing, minimal (2000s FWHM) highpass temporal filtering.
- WhII_MB6.RData - derived from the Whitehall imaging study, using multiband x6 EPI acceleration: TR=1.3s, Resolution=2x2x2mm, Session=10mins, no spatial smoothing, 100s FWHM highpass temporal filtering.
- WhII_Standard.RData - derived from more traditional early parallel scanning in the Whitehall imaging study, using no EPI acceleration: TR=3s, Resolution=3x3x3mm, Session=10mins, no spatial smoothing, 100s FWHM highpass temporal filtering.
- UKBiobank.RData - derived from fairly HCP-like scanning in the UK Biobank imaging study: 40 subjects, TR=0.735s, Resolution=2.4x2.4x2.4mm, Session=6mins, no spatial smoothing, 100s FWHM highpass temporal filtering.
You can find example training-input data, including our hand-labellings, here (note that you do not need this example training-input data in order to run FIX; you just need the trained-weights files included in the FIX directory).
How to create and use a new trained-weights file
To do your own training, you will need to create a hand_labels_noise.txt file in each FEAT/MELODIC output directory. This text file should contain a single line (or, at least, should have as its final line) a list of the bad components only, with the format (for example): [1, 4, 99, ... 140] - note that the square brackets and the commas are required. Counting starts at 1, not 0. Once you have created all of the hand label files, you can then train the classifier (creating the trained-weights file <Training>.RData) using the -t option:
/usr/local/fix/fix -t <Training> [-l] <Melodic1.ica> <Melodic2.ica> ...
If you include the -l option after the trained-weights output filename, a full leave-one-out test will be run; the results file that gets created at the end has a set of numbers at the end of it that tell you the true-positive-rate (TPR, proportion of "good" components correctly labelled) and the true-negative-rate (TNR, proportion of "bad" components correctly labelled) for a wide range of thresholds (see higher up in the output file for the list of thresholds tested).
The outputs from this command are:
- Training.RData - your new trained-weights file, to be used for subsequent classification
- Training - a folder with a copy of the labels and the features of the subjects used to build the training dataset
- Training_LOO - a folder containing the intermediate files for the leave-one-out test (if you used the -l option)
- Training_LOO_results - a file with the results of the leave-one-out test (if you used the -l option)
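To make the TPR and TNR definitions used in the leave-one-out results concrete, here is a small worked sketch for a single subject. It applies the definitions given on this page to a hand labelling and a predicted labelling; it does not parse the results file and is not how FIX itself computes the numbers. The function name tpr_tnr and the example component lists are made up for illustration.

```python
def tpr_tnr(n_components, hand_noise, predicted_noise):
    """Return (TPR, TNR) as fractions for one subject.

    TPR: proportion of hand-labelled "good" components also predicted good.
    TNR: proportion of hand-labelled "bad" components also predicted bad.
    Component numbers are 1-based.
    """
    all_ics = set(range(1, n_components + 1))
    hand_bad, pred_bad = set(hand_noise), set(predicted_noise)
    hand_good, pred_good = all_ics - hand_bad, all_ics - pred_bad
    tpr = len(hand_good & pred_good) / len(hand_good) if hand_good else float("nan")
    tnr = len(hand_bad & pred_bad) / len(hand_bad) if hand_bad else float("nan")
    return tpr, tnr


# 10 components; hand labels say 1-4 are noise, the classifier says 1, 2, 3, 5.
tpr, tnr = tpr_tnr(10, [1, 2, 3, 4], [1, 2, 3, 5])
print(f"TPR={tpr:.3f} TNR={tnr:.3f}")  # TPR=0.833 TNR=0.750
```

In this example one good component (5) is wrongly removed and one bad component (4) is missed, which is why both rates fall below 1.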
You can now use your new trained-weights file to classify components in new datasets and then run the cleanup on the new data (see above):
/usr/local/fix/fix -c <Melodic-output.ica> <Training.RData> <thresh>
/usr/local/fix/fix -a <mel.ica/fix4melview_TRAIN_thr.txt> [-m [-h <highpass>]] [-A] [-x <confound>] [-x <confound2>]
If you want to test the accuracy of an existing training dataset on a set of hand-labelled subjects (e.g. to test whether an existing trained-weights file is suitable for your study or whether it’s better to create a new one), you can run the following command:
/usr/local/fix/fix -C <training.RData> <output> <mel1.ica> <mel2.ica> ...
This classifies the components for all listed Melodic directories over a range of thresholds and produces LOO-style accuracy testing using existing hand classifications. Every Melodic directory must contain hand_labels_noise.txt listing the artefact components, e.g.: [1, 4, 99, ... 140].
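Since the hand label files have a strict format, it can help to generate them rather than type them. The sketch below writes hand_labels_noise.txt in the format described above (square brackets, commas, 1-based numbering, trailing newline). Sorting and de-duplicating the numbers is a choice made here for tidiness, not a documented FIX requirement, and the function names are illustrative.

```python
from pathlib import Path


def format_noise_line(components):
    """Format component numbers as a bracketed, comma-separated list."""
    comps = sorted({int(c) for c in components})
    if any(c < 1 for c in comps):
        raise ValueError("component numbers start at 1, not 0")
    return "[" + ", ".join(str(c) for c in comps) + "]"


def write_hand_labels(ica_dir, components):
    """Write hand_labels_noise.txt into a MELODIC/FEAT output directory."""
    out = Path(ica_dir) / "hand_labels_noise.txt"
    # End with a newline, i.e. "hit return after writing the list".
    out.write_text(format_noise_line(components) + "\n")
    return out
```

For example, write_hand_labels("sub01.ica", [4, 1, 99]) would create sub01.ica/hand_labels_noise.txt containing [1, 4, 99]; the directory name here is hypothetical.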
Input files required - in more detail
If you haven't done the full GUI-based MELODIC/FEAT analysis, you will need, in one directory:
- filtered_func_data.nii.gz - preprocessed 4D data
- filtered_func_data.ica - melodic (command-line program) full output directory
- mc/prefiltered_func_data_mcf.par - motion parameters created by mcflirt (in mc subdirectory)
- mask.nii.gz - valid mask relating to the 4D data
- mean_func.nii.gz - temporal mean of 4D data
- reg/example_func.nii.gz - example image from 4D data
- reg/highres.nii.gz - brain-extracted structural
- reg/highres2example_func.mat - FLIRT transform from structural to functional space
- design.fsf - FEAT/MELODIC setup file; if present, this controls the default temporal filtering of motion parameters
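Before running FIX on a hand-assembled directory, a quick existence check can catch missing inputs early. The sketch below tests only that the paths listed above are present; it does not validate their contents. design.fsf is left out of the required list because the page describes it as optional ("if present"). The names REQUIRED_INPUTS and missing_inputs are illustrative.

```python
from pathlib import Path

# Paths relative to the input directory, as listed on this page.
REQUIRED_INPUTS = [
    "filtered_func_data.nii.gz",
    "filtered_func_data.ica",
    "mc/prefiltered_func_data_mcf.par",
    "mask.nii.gz",
    "mean_func.nii.gz",
    "reg/example_func.nii.gz",
    "reg/highres.nii.gz",
    "reg/highres2example_func.mat",
]


def missing_inputs(directory):
    """Return the required relative paths that do not exist under directory."""
    root = Path(directory)
    return [rel for rel in REQUIRED_INPUTS if not (root / rel).exists()]
```

An empty return value means every listed file or folder was found.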
Download past versions of FIX
A selection of past versions of FIX are available at the links below.
- fix-1.06.14.tar.gz (MCR: macOS, linux)
- fix-1.06.13.tar.gz (MCR: macOS, linux)
- fix-1.06.12.tar.gz (MCR: macOS, linux)
- fix-1.06.10.tar.gz (MCR: macOS, linux)
- fix1.069.tar.gz
- fix1.068.tar.gz
- fix1.066.tar.gz
Pre-1.069 Changelog
- v1.061 has a tiny change from 1.06, in that it can work with the newest flavours of R that had started to create problems for 1.06. 1.061 can be used with features from 1.05-1.06 and training files from 1.06.
- v1.062 has a couple of minor changes to matlab code that means that near-rank-deficiency across the cleanup timeseries is more robustly handled.
- v1.063 and v1.064 have a couple of minor bugfixes in matlab code.
- v1.065 has a minor change to be compatible with an upcoming change in a future FSL release of smoothest (while still being compatible with older FSL versions).
- v1.066 adds an option to only process CIFTI data.
- v1.067 adds ability to write out variance normalisation factor images for use in HCP.
- v1.068 has some configuration file bugfixes and compiled matlab for MacOS and Linux.
FAQ
When I run FIX, I obtain the following output: “No valid labelling file specified”. What does it mean?
FIX doesn’t find the classification file with the list of components to be removed, so the error could be either in the feature extraction or in the classification. To see which is the problem, have a look at the following log files:
- <subject.ica>/fix/logMatlab.txt (this should show errors in the MATLAB part, i.e. feature extraction)
- <subject.ica>/.fix.log
- <subject.ica>/.fix_2b_predict.log (these two are general log files for the whole routine)
You’ll probably find errors related to MATLAB or R, so you might need to check your settings.sh file, following the setup instructions described in the FIX README file.
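When several subjects fail, collecting the ends of these logs in one go can save time. This is a minimal sketch that assumes the three log locations named above, relative to the subject's .ica directory; the function name tail_logs and the default of 20 lines are arbitrary choices.

```python
from pathlib import Path

# Log locations relative to <subject.ica>, as named on this page.
LOG_FILES = ["fix/logMatlab.txt", ".fix.log", ".fix_2b_predict.log"]


def tail_logs(ica_dir, n_lines=20):
    """Map each log file that exists to its last n_lines lines."""
    tails = {}
    for rel in LOG_FILES:
        log = Path(ica_dir) / rel
        if log.is_file():
            # errors="replace" avoids crashing on odd bytes in the logs
            tails[rel] = log.read_text(errors="replace").splitlines()[-n_lines:]
    return tails
```

Logs that are absent are simply skipped, which is itself a hint about which stage never ran.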
How do I choose the best training dataset (among the existing ones) and/or threshold for my data?
FIX is more likely to work better with the training dataset that is most similar to your data, both in terms of acquisition parameters (TR and resolution) and preprocessing steps applied. Regarding the threshold to use, you can start with the “default” 20 and increase or decrease it according to FIX performance (i.e. visual check of the components' classification contained in the file fix4melview_TRAIN_thr.txt). For example, if it is very important to you that almost no good components are removed, and hence you would prefer to leave in the data a larger number of bad components, then use a low threshold. If you want to remove more noise, use a higher threshold.
What is the difference between fsl_regfilt and FIX?
FIX is an automated equivalent of fsl_regfilt (they both perform non-aggressive (unique variance) cleanup by default), so you don’t need to run both:
- fsl_regfilt: manual classification of unwanted components + run fsl_regfilt -> cleaned data
- FIX: automated classification of artefactual components and regression of their contribution out of the data -> cleaned data
To check that FIX is removing the artefactual components correctly (i.e. it is doing what you would do running fsl_regfilt) you can check the classification done by FIX in the fix4melview...txt file and adjust the training dataset and threshold you are using as appropriate.
Can I use FIX to clean task fMRI data?
Yes, although you will probably need to create a study-specific training dataset.
When I run FIX to create a new training file (-t), the output folder is created, but no .RData is produced at the end, with no explicit error message. What does it mean?
Check the content of the following hidden files within the output directory created:
- .fixlist --> should contain the list of subjects included in the training dataset (to check if they've all been loaded/recognised properly)
- .Rlog1 --> contains errors from R about the generation of the .RData file
- Also, make sure that the .txt files (hand_labels_noise.txt) are in the correct format: the last line should contain the list of the components only, within square brackets and comma separated, and there should be an empty line at the end (i.e. hit return after writing the list).
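The format rules in the last point can be checked automatically across all training subjects. The sketch below reports problems with a single labels file according to the rules stated on this page (bracketed, comma-separated final line, 1-based numbers, newline at the end). It is an illustrative checker, not FIX's own parser, so a file that passes here is only likely, not guaranteed, to be accepted.

```python
import re
from pathlib import Path


def label_file_problems(path):
    """Return a list of human-readable format problems (empty if none found)."""
    text = Path(path).read_text()
    problems = []
    if not text.endswith("\n"):
        problems.append("file does not end with a newline")
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        problems.append("file has no content")
        return problems
    last = lines[-1]
    if not re.fullmatch(r"\[\s*\d+(\s*,\s*\d+)*\s*\]", last):
        problems.append("final line is not a bracketed, comma-separated list")
    elif any(int(tok) == 0 for tok in re.findall(r"\d+", last)):
        problems.append("component numbers start at 1, not 0")
    return problems
```

Running this over every hand_labels_noise.txt before calling fix -t narrows down which subject is responsible when no .RData file appears.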