FIX - FMRIB's ICA-based Xnoiseifier

FIX is a tool which can be used to remove noise from 4D FMRI data. FIX is intended to be run on single-session MELODIC ICA output. FIX attempts to auto-classify ICA components into good vs bad components, so that the bad components can be removed from 4D FMRI data. See example raw data movies showing the (potentially huge) effect of FIX cleanup.

Note

FIX is installed as part of FSL 6.0.7.8 and newer. FIX was originally written in MATLAB, R and Bash, but has been re-written in Python to make it more widely available and easier to install. Details on the old R/MATLAB version can be found here.

For FIX to work well, it is very important that it is run using good training data. While a few example trained-weights files are supplied with FIX, for major studies we would strongly recommend training FIX on your own study data (see details in the User Guide section). You can find example training-input data, including our hand-labellings, here. Note that you do not need this example training-input data in order to run FIX - you can use one of the included trained-weights files, or (preferably) train FIX on your own data.

Referencing

If you use FIX in your research, please cite these papers:

G. Salimi-Khorshidi, G. Douaud, C.F. Beckmann, M.F. Glasser, L. Griffanti, and S.M. Smith. Automatic denoising of functional MRI data: Combining independent component analysis and hierarchical fusion of classifiers. NeuroImage, 90:449-68, 2014

L. Griffanti, G. Salimi-Khorshidi, C.F. Beckmann, E.J. Auerbach, G. Douaud, C.E. Sexton, E. Zsoldos, K. Ebmeier, N. Filippini, C.E. Mackay, S. Moeller, J.G. Xu, E. Yacoub, G. Baselli, K. Ugurbil, K.L. Miller, and S.M. Smith. ICA-based artefact removal and accelerated fMRI acquisition for improved resting state network imaging. NeuroImage, 95:232-47, 2014

Overview

FIX has the following stages of execution:

Feature extraction: Extracting information (features) from the 4D fMRI data which will be used for IC classification
Model training: Training a classifier on extracted features and hand-labelled data
IC classification: Using a pre-trained model to classify ICs as signal or noise.
Data cleaning: Regressing the time courses of the noise ICs out of the fMRI data to produce a "cleaned" version of the data, ready for subsequent analysis.

Using FIX with a pre-trained model

If you want to use FIX with one of the provided pre-trained models, or with an existing model of your own, you can run feature extraction, IC classification, and data cleaning with a single command:

fix <melodic.ica> <model> <threshold> [cleanup options]

For example:

if your MELODIC ICA directory is called rest.ica
you want to use the built-in UKBiobank classification model
you want to use a classification threshold of 20 (more on this below)
you also want to regress out motion confounds (more on this below)

You would run this command:

fix rest.ica UKBiobank 20 -m

Your input MELODIC directory (rest.ica above) must be a full "first-level" (single-session) output directory created by the MELODIC or FEAT GUIs, with full registration run, including using a structural image. If using FEAT, you need to have had ICA turned on in the Prestats tab. For the single-subject ICA you should in general use MELODIC's automatic dimensionality estimation (which creates the sub-folder <mel.ica>/filtered_func_data.ica).

The <threshold> (20 in the example above) refers to the thresholding of good vs bad components; sensible values are generally in the range of 5-20. However, if it is very important to you that almost no good components are removed, and hence you would prefer to leave in the data a larger number of bad components, then use a low threshold (e.g., in the range 1-5).
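To apply the same model and threshold to several sessions, the one-step command above can be wrapped in a loop. The following is a minimal shell sketch rather than part of FIX: the function name fix_batch and the DRY_RUN variable are hypothetical, and the only FIX invocation used is the `fix <melodic.ica> <model> <threshold> -m` form shown above.

```shell
# Hypothetical helper (not part of FIX): run the one-step FIX command on
# several MELODIC directories with the same model and threshold.
# Usage: fix_batch <model> <threshold> <mel1.ica> [<mel2.ica> ...]
# Set DRY_RUN=1 to print the commands instead of running them.
fix_batch() {
    model=$1
    threshold=$2
    shift 2
    for ica_dir in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "fix $ica_dir $model $threshold -m"
        else
            # Stop at the first failure so that problems are not silently skipped
            fix "$ica_dir" "$model" "$threshold" -m || return 1
        fi
    done
}
```

For example, `DRY_RUN=1 fix_batch UKBiobank 20 sub01/rest.ica sub02/rest.ica` (with hypothetical directory names) prints the two commands that would be run, which is a cheap way to check the arguments before starting a long job.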

FIX will save the following files in your MELODIC directory (rest.ica in the example above):

fix4melview_<model>_thr<threshold>.txt (e.g. fix4melview_UKBiobank_thr20.txt): A text file containing the IC classifications/labels. The final line in this file lists the components that are considered as noise to be removed (with counting starting at 1, not 0). You can load your melodic_IC.nii.gz file along with this label file into FSLeyes for interactive visualisation and assessment of the IC classification. It is strongly recommended that you look at the ICA components yourself to check at least a few of your subjects' classifications.
fix: A directory containing the extracted features that are used for IC classification.
filtered_func_data_clean.nii.gz: The cleaned version of the 4D preprocessed FMRI data, after regression of the noise component time courses.
filtered_func_data_clean_vn.nii.gz: A 4D file containing the standard deviation of the unstructured/gaussian noise which remains in the cleaned FMRI data, after regression of the noise components. This file is generated by regressing the time courses of the signal components from the cleaned FMRI data.
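Because the final line of the label file lists the noise components, a quick sanity check is to pull that line out and count how many components were flagged. The sketch below is a hypothetical helper, not a FIX command; it assumes the final line is a bracketed, comma-separated list such as [1, 4, 99], which is the format described for hand label files later in this document.

```shell
# Hypothetical helper (not part of FIX): print the noise component numbers
# from the final line of a FIX label file, one number per line.
# Assumes the final line looks like: [1, 4, 99]
# Component numbering starts at 1, as in the label file itself.
noise_components() {
    tail -n 1 "$1" |      # keep only the final line
        sed 's/[][ ]//g' | # drop the brackets and spaces
        tr ',' '\n' |      # one component number per line
        sed '/^$/d'        # drop empty lines (e.g. for an empty list)
}
```

For example, `noise_components rest.ica/fix4melview_UKBiobank_thr20.txt | wc -l` gives the number of components that would be removed for that model and threshold.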

Running each stage separately

The command described above is equivalent to running the following three steps:

Feature extraction: extract features for later training/classification:
```
fix -f <mel.ica>
```
IC classification: classify ICA components using a specific trained FIX model (<threshold> is in the range 0-100, typically 5-20):
```
fix -c <mel.ica> <model> <threshold>
```
Data cleaning: apply cleanup, regressing out the time series of noise ICs listed in the labels.txt file, to the data inside the enclosing FEAT/MELODIC directory. This text file can be the output from the IC classification step above, or can be created manually, either by hand, or using FSLeyes for manual IC classification. You can find more information on the FIX label file format in the FSLeyes documentation.
```
fix -a <labels.txt> [<mel.ica>] [-m [-h <highpass>]] [-A]
```

If the <mel.ica> directory is not specified, FIX assumes that <labels.txt> is contained within the <mel.ica> directory that contains the data to be cleaned.

You have some options for more fine-grained control over the cleanup:

- -m: optionally also clean up motion confounds, with highpass filtering of the motion confounds controlled by -h:
    - if -h is omitted, fix will look to see if a FEAT design.fsf file is present, to find the highpass cutoff.
    - if -h is omitted, and no design.fsf is present, no filtering of the motion confounds will take place.
    - -h -1: apply no highpass filtering.
    - -h 0: apply linear detrending only.
    - -h <highpass>: with a positive <highpass> value, apply highpass filtering with <highpass> being the full width (\(2\sigma\)) in seconds.
- -A: apply aggressive (full variance) cleanup, instead of the default less-aggressive (unique variance) cleanup.
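The three stages can be chained so that a later stage only runs if the earlier one succeeded. The sketch below is a hypothetical wrapper, not part of FIX. It assumes that the label file written by the classification stage is named fix4melview_<model>_thr<threshold>.txt, with any directory and .pyfix_model suffix removed from the model name (an inference from the file names used in this document). Any extra arguments are passed unchanged to the cleanup stage.

```shell
# Hypothetical wrapper (not part of FIX): run feature extraction,
# classification and cleanup as three separate stages.
# Usage: fix_stages <mel.ica> <model> <threshold> [cleanup options]
# Set DRY_RUN=1 to print the commands instead of running them.
fix_stages() {
    ica_dir=$1
    model=$2
    threshold=$3
    shift 3
    # Assumption: the label file is named after the model, without any
    # leading directory or .pyfix_model suffix.
    name=$(basename "$model" .pyfix_model)
    labels="$ica_dir/fix4melview_${name}_thr${threshold}.txt"
    run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi
    # Each stage runs only if the previous one succeeded
    $run fix -f "$ica_dir" &&
        $run fix -c "$ica_dir" "$model" "$threshold" &&
        $run fix -a "$labels" "$@"
}
```

For example, `fix_stages rest.ica UKBiobank 20 -m -h 0` would run the three stages with motion confound cleanup and linear detrending of the confounds. Running the stages separately is mainly useful when you want to inspect or edit the label file between classification and cleanup.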

Pre-trained models

FIX needs to be trained from multiple datasets that have already had the ICA components classified by hand as being either good (signal) or bad (noise). FIX includes pre-trained models for a few different types of data which you can use on your own data. However, if you want to train FIX yourself (which in general is strongly recommended), to better optimise it for the kind of data you have, you will need to do this hand classification yourself for at least 10 of your subjects. You can perform this hand-classification from within FSLeyes - refer to the FSLeyes documentation for more details.

You may be able to use one of the pre-trained models that are supplied with FIX:

Standard: for use on more "standard" FMRI datasets / analyses; e.g., TR=3s, Resolution=3.5x3.5x3.5mm, Session=6mins, default FEAT preprocessing (including default spatial smoothing).
HCP25_hp2000: for use on "minimally-preprocessed" 3T HCP-like datasets, e.g., TR=0.7s, Resolution=2x2x2mm, Session=15mins, no spatial smoothing, minimal (2000s FWHM) highpass temporal filtering.
HCP7T_hp2000: for use on "minimally-preprocessed" 7T HCP-like datasets, e.g., TR=1.0s, Resolution=1.6x1.6x1.6mm, Session=15mins, no spatial smoothing, minimal (2000s FWHM) highpass temporal filtering.
HCP_Style_Single_Multirun_Dedrift: derived from task and resting state fMRI data from 75 HCP young adults 3T data sets.
WhII_MB6: derived from the Whitehall imaging study, using multiband x6 EPI acceleration: TR=1.3s, Resolution=2x2x2mm, Session=10mins, no spatial smoothing, 100s FWHM highpass temporal filtering.
WhII_Standard: derived from more traditional early parallel scanning in the Whitehall imaging study, using no EPI acceleration: TR=3s, Resolution=3x3x3mm, Session=10mins, no spatial smoothing, 100s FWHM highpass temporal filtering.
UKBiobank: derived from fairly HCP-like scanning in the UK Biobank imaging study: 40 subjects, TR=0.735s, Resolution=2.4x2.4x2.4mm, Session=6mins, no spatial smoothing, 100s FWHM highpass temporal filtering.
NHP_HCP_Macaque: derived from NHP-HCP Macaque data, using multiband x5 EPI acceleration: TR=0.76s, Resolution=1.25x1.25x1.25mm, Session=102min, no spatial smoothing, minimal (2000s FWHM) highpass temporal filtering.
NHP_HCP_MacaqueCyno: derived from NHP-HCP Cynomolgus Macaque data.

You can find example training-input data, including our hand-labellings, here (note that you do not need this example training-input data in order to run FIX; you just need the trained-weights files included in the FIX directory).

Training your own FIX model

To do your own training, for each FEAT/MELODIC output directory, you will need to create a hand_labels_noise.txt file in the output directory. This text file should contain a single line (or, at least, should have as its final line) a list of the bad components only, with the format (for example): [1, 4, 99, ... 140] (see the FSLeyes documentation for more details on the required format). You also need to perform feature extraction on each of your training data sets.
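If you keep your hand classifications somewhere other than FSLeyes (for example in a spreadsheet), the single-line form of the label file can be written with a small script. This is a minimal sketch with a hypothetical function name, producing only the one-line bracketed list described above; consult the FSLeyes documentation for the full label file format.

```shell
# Hypothetical helper (not part of FIX): write a single-line
# hand_labels_noise.txt file into a MELODIC directory.
# Usage: write_hand_labels <mel.ica> <component> [<component> ...]
# Component numbers are the bad (noise) components, counted from 1.
write_hand_labels() {
    ica_dir=$1
    shift
    list=""
    for comp in "$@"; do
        if [ -z "$list" ]; then
            list="$comp"
        else
            list="$list, $comp"
        fi
    done
    # Produces a line such as: [1, 4, 99]
    printf '[%s]\n' "$list" > "$ica_dir/hand_labels_noise.txt"
}
```

For example, `write_hand_labels rest.ica 1 4 99` writes the line [1, 4, 99] to rest.ica/hand_labels_noise.txt.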

Once you have created all of the hand label files and performed feature extraction, you can then train FIX using the -t option:

fix -t mymodel [-l] <mel1.ica> <mel2.ica> <mel3.ica> ...

FIX will save the model to a file mymodel.pyfix_model.
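Before running the training command, it can be worth confirming that every training directory has both a hand label file and extracted features. The sketch below is a hypothetical pre-flight check, not part of FIX. It assumes that feature extraction leaves a fix sub-directory inside each MELODIC directory, as described in the list of output files earlier; the presence of that directory does not by itself guarantee that feature extraction completed.

```shell
# Hypothetical pre-flight check (not part of FIX): report MELODIC
# directories that do not look ready for training.
# Usage: check_training_dirs <mel1.ica> [<mel2.ica> ...]
# Returns 0 if every directory passes, 1 otherwise.
check_training_dirs() {
    missing=0
    for ica_dir in "$@"; do
        if [ ! -f "$ica_dir/hand_labels_noise.txt" ]; then
            echo "$ica_dir: missing hand_labels_noise.txt"
            missing=1
        fi
        # Assumption: "fix -f" creates a fix/ directory holding the features
        if [ ! -d "$ica_dir/fix" ]; then
            echo "$ica_dir: missing fix/ feature directory (run fix -f)"
            missing=1
        fi
    done
    return $missing
}
```

A typical use would be `check_training_dirs sub*/rest.ica && fix -t mymodel -l sub*/rest.ica`, where the directory pattern is hypothetical.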

If you include the -l option after the model output filename, a full leave-one-out (LOO) test will be performed. The results will be saved to a file called mymodel_LOO_results; this file has a set of numbers at the end of it that tell you the true-positive-rate (TPR, proportion of "good" components correctly labelled) and the true-negative-rate (TNR, proportion of "bad" components correctly labelled) for a wide range of thresholds (see higher up in the output file for the list of thresholds tested).
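In terms of counts, and treating "good" (signal) components as the positive class as in the description above, the two rates at a given threshold can be written as follows. The symbols are introduced here for illustration only and do not appear in the FIX output.

\[
\mathrm{TPR} = \frac{N_{\text{good, classified good}}}{N_{\text{good}}},
\qquad
\mathrm{TNR} = \frac{N_{\text{bad, classified bad}}}{N_{\text{bad}}}
\]

Here \(N_{\text{good}}\) and \(N_{\text{bad}}\) are the numbers of components hand-labelled as good and bad. As a worked example with hypothetical counts: if 18 of 20 hand-labelled good components are classified as good, and 72 of 80 hand-labelled bad components are classified as bad, then \(\mathrm{TPR} = 18/20 = 0.9\) and \(\mathrm{TNR} = 72/80 = 0.9\).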

The outputs from this command are:

mymodel.pyfix_model - Your FIX model, which can be used for subsequent classification
mymodel_LOO_results - a text file with the results of the leave-one-out test (if you used the -l option)
mymodel.file_table - a text file containing an index of all files used for training, used internally by FIX.

You can now use your new model file to classify components in new datasets and then run the cleanup on the new data (as outlined above). For example:

fix rest.ica mymodel.pyfix_model 20

Or (equivalently)

fix -f rest.ica
fix -c rest.ica mymodel.pyfix_model 20
fix -a rest.ica/fix4melview_mymodel_thr20.txt

If you want to test the accuracy of an existing training dataset on a set of hand-labelled subjects (e.g. to test whether an existing trained-weights file is suitable to be used for your study or if it’s better to create a new one), you can run the following command:

fix -C mymodel.pyfix_model <output> <mel1.ica> <mel2.ica> ...

which classifies the components for all listed MELODIC directories over a range of thresholds and produces LOO-style accuracy testing using existing hand classifications. Every MELODIC directory must contain a hand_labels_noise.txt file listing the artefact components, e.g.: [1, 4, 99, ... 140].

Training FIX on MRI data from different species

By default, FIX assumes that your data has been registered to the MNI152 template, which is the default option when using the FEAT or MELODIC GUIs to analyse your data. FIX uses a built-in set of mask images, aligned to the MNI152 template, to extract a set of spatial features.

FIX has some other built-in sets of mask images which can be used if you are extracting features from data that has been aligned to a different template (e.g. non-human MRI). Currently, FIX has two sets of mask images for use with macaque MRI data from the NHP-HCP project. These mask images have been kindly contributed by Takuya Hayashi, of the Riken Center for Biosystems Dynamics Research.

If you are performing feature extraction for use with the pre-trained NHP_HCP_Macaque model, you should specify --species macaque, e.g.:

fix -f --species macaque rest.ica

Similarly, if you are performing feature extraction for use with the pre-trained NHP_HCP_MacaqueCyno model, you should specify --species macaque_cyno, e.g.:

fix -f --species macaque_cyno rest.ica

The --species option is not necessary when performing feature extraction, classification and clean-up in one step, as the species identifier is saved in the FIX model file. In this case, you just need to give the name of the pre-trained model, for example:

fix rest.ica NHP_HCP_Macaque 20

Converting an old MATLAB/R FIX `.RData` model to a new `.pyfix_model` file

R is required

Note that you will need to have R installed and available in your shell environment (specifically the Rscript command) in order to convert an old FIX .RData model to a .pyfix_model file.

If you have already created a model with the old MATLAB/R version of FIX, you can convert it for use with the new Python-based version by using the convert_fix_model command. For example, if your model file is named mymodel.RData, you can run this command:

convert_fix_model mymodel.RData mymodel.pyfix_model

If you would like a copy of the features and hand labels that were originally used to create the .RData model, you can use the -d option, for example:

convert_fix_model -d ./mymodel-data mymodel.RData mymodel.pyfix_model

This will cause a directory ./mymodel-data to be created; within this directory, a set of .ica directories will be created, with each containing the features and hand labels for one subject.
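Since the conversion depends on Rscript being available, a script that converts models can check for it first and fail with a clear message. The following is a minimal sketch with hypothetical function names; the only FIX command it calls is convert_fix_model in the two-argument form shown above.

```shell
# Hypothetical helpers (not part of FIX).

# Succeeds if the named command can be found in the current shell environment.
have_command() {
    command -v "$1" > /dev/null 2>&1
}

# Convert an old .RData model, but only if Rscript is available.
# Usage: convert_old_model <model.RData> <model.pyfix_model>
convert_old_model() {
    if ! have_command Rscript; then
        echo "Rscript not found; install R before converting $1" >&2
        return 1
    fi
    convert_fix_model "$1" "$2"
}
```

For example, `convert_old_model mymodel.RData mymodel.pyfix_model` behaves like the plain conversion command when R is installed, and otherwise stops before attempting the conversion.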

Acknowledgments

The data and hand labels for the built-in pyFIX models have been contributed by a large number of people, including:

Gholamreza Salimi Khorshidi
Ludovica Griffanti
Gwenaelle Douaud
David Flitney
Claire Sexton
Eniko Zsoldos
Klaus Ebmeier
Nicola Filippini
Clare Mackay
Stephen Smith (Oxford)
Matthew Glasser
Donna Dierker
Erin Reid
David Van Essen (WashU)
Edward Auerbach
Steen Moeller
Junqian Xu
Essa Yacoub
Kamil Ugurbil (Minnesota)
Giuseppe Baselli (Milan)
Christian Beckmann (Donders)
Takuya Hayashi (Riken Center)