FMRIB configuration profile
FUNPACK comes with a built-in configuration profile (the “FMRIB” profile), containing processing rules for a large number of UK Biobank data fields. These rules can be applied by running:
fmrib_unpack -cfg fmrib <output.tsv> <input.csv>
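When FUNPACK is driven from a script, the same invocation can be assembled programmatically. The sketch below only builds the argument list shown above; the helper name `build_funpack_command` and the file names are illustrative assumptions, not part of FUNPACK.

```python
import shlex


def build_funpack_command(output_tsv, input_csv, profile="fmrib"):
    """Return the argument list for:
    fmrib_unpack -cfg <profile> <output.tsv> <input.csv>
    """
    return ["fmrib_unpack", "-cfg", profile, str(output_tsv), str(input_csv)]


cmd = build_funpack_command("output.tsv", "input.csv")

# Show the command as it would be typed in a shell
print(shlex.join(cmd))

# To actually run it (requires FUNPACK to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing a different `profile` value selects another configuration profile in the same way.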
The fmrib configuration profile is installed alongside the FUNPACK source code. It can be viewed online, or found in your local FUNPACK installation within <python-env>/lib/python<X.Y>/site-packages/funpack/configs/ (replacing <python-env> with the location of your Python environment, and <X.Y> with the Python version).
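Rather than working out the site-packages path by hand, the location can be queried from Python. This is a minimal sketch that assumes the package is importable under the name `funpack` and that the profiles live in a `configs` sub-directory, as the path above suggests; the function name `funpack_config_dir` is made up for illustration.

```python
import importlib.util
from pathlib import Path


def funpack_config_dir():
    """Return the configs/ directory of the installed funpack package,
    or None if funpack cannot be found in this environment."""
    spec = importlib.util.find_spec("funpack")
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at funpack/__init__.py
    return Path(spec.origin).parent / "configs"


config_dir = funpack_config_dir()
if config_dir is None:
    print("funpack is not installed in this environment")
else:
    print(config_dir)
```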
Note
The fmrib configuration profile is managed independently of the FUNPACK source code, at https://git.fmrib.ox.ac.uk/fsl/funpack-fmrib-config/, but is always installed alongside FUNPACK.
The fmrib configuration profile is split across several files, each of which is described below, followed by the contents of that file.
funpack/configs/fmrib.cfg: Top-level configuration file, containing general settings and references to the other configuration files.
#
# FUNPACK "fmrib.cfg" configuration file
#
# Provides FMRIB-recommended processing of UKB tables
#
# Record config version number in FUNPACK logging
echo "FUNPACK FMRIB configuration version: 1.7.1"
# Use local settings
config_file local
# Contains some FMRIB-specific plugin functions,
# including date/time normalisation.
plugin_file fmrib
# Drop non-numeric columns - the main output
# file only contains numeric data.
suppress_non_numerics
# Auto-import auxiliary variables which are
# used in processing steps, and which would
# otherwise not be imported.
add_aux_vars
# Only import variables from FMRIB-curated categories,
# largely drawn from showcase categories
category_file fmrib/categories.tsv
#
# FUNPACK cleaning/processing stages
#
# - NA insertion
# - Categorical recoding
# - Cleaning functions (e.g. replacing ICD codes with numeric equivalents)
# - Child value replacement
# - "Processing" (e.g. one-hot encoding, redundancy check)
#
# Each is activated or deactivated by certain flags.
#
# - NA insertion
# Specify with `-vf` or `-df`, table listing "NAValues"
# Suppress with `-sn` option
# - Categorical recoding
# Specify with `-vf` or `-df`, table listing "RawLevels" and "NewLevels"
# Suppress with `-sr` option
# - Cleaning
# Specify with `-vf`, table listing cleaning functions in "Clean"
# Suppress with `-scf`
# - Child value replacement
# Specify with `-vf`, table listing "ParentValues" and "ChildValues"
# Suppress with `-scv`
# - Processing
# Specify with `-pf`, file listing variables and processing functions
# Suppress with `-sp`
# - NA insertion
datacoding_file fmrib/datacodings_navalues.tsv
# - Categorical recoding
datacoding_file fmrib/datacodings_recoding.tsv
# - Cleaning
variable_file fmrib/variables_clean.tsv
# Date/timestamp normalisation (performed in the FUNPACK cleaning stage)
# Converts a date or date+time into a single value x, where floor(x) is the
# calendar year and the fraction is the day/time within the year, *except*
# that 'a day' is redefined as the time between 7am and 8pm (scanning only
# takes place within these hours).
type_file fmrib/datetime_formatting.tsv
# - Child value replacement
variable_file fmrib/variables_parentvalues.tsv
# - Processing -
processing_file fmrib/processing.tsv
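The listing above follows a simple pattern: comment lines start with `#`, and every other line is an option name optionally followed by a value. The sketch below reads that pattern into a list of pairs. It is an illustration of the file layout only, not FUNPACK's own parser, and the function name `parse_config` is an assumption.

```python
def parse_config(text):
    """Parse configuration text into a list of (option, value) pairs.

    Blank lines and lines starting with '#' are skipped. Options without
    a value (e.g. 'suppress_non_numerics') get the value None. A list is
    returned rather than a dict, because options such as 'variable_file'
    can appear more than once.
    """
    options = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        value = parts[1].strip() if len(parts) == 2 else None
        options.append((parts[0], value))
    return options


example = """
# Use local settings
config_file local
suppress_non_numerics
variable_file fmrib/variables_clean.tsv
variable_file fmrib/variables_parentvalues.tsv
"""

parsed = parse_config(example)
for option, value in parsed:
    print(option, value)
```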
funpack/configs/local.cfg: Included by fmrib.cfg. Contains some miscellaneous settings related to performance and error-checking.
#
# FUNPACK "local.cfg" configuration file
#
# Provides local configuration that likely all users will want to use
#
# Overwrite files
overwrite
# Number of cores to use in parallelisation; -1 for 'all possible'
num_jobs -1
# It is always recommended to trust data types of each column; without this option
# processing is considerably slower. Disable when debugging a possibly problematic file.
trust_types
funpack/configs/fmrib/categories.tsv: Definition of FMRIB data-field categories - groups of related data-fields. Categories can be selected with the -c/--category option.
ID Category Variables
1 ID, age, sex, brain MRI protocol Phase 0,31,33,34,52:55,21022,22200,25780
2 genetics 21000,22000:22031,22041:22125,22190:22194,22201:22325,22182,22800:22823,26200:26290
3 early life factors 129,130,1677,1687,1697,1737,1767,1777,1787,21066,20022
10 lifestyle and environment - general 3:6,132,189,670,680,699,709,728,738,767,777,1031,1797,1807,1835,1845,1873,1883,2139,2149,2159,2237,2375,2385,2395,2405,2267,2277,2714:10:2834,2946,3526,3536,3546,3581,3591,3659,3669,3700,3710,3720,3829,3839,3849,3872,3882,3912,3942,3972,3982,4501,4674,4825,4836,5057,6138,6142,6139:6141,6145:6146,6160,10016,10105,10114,10721,10722,10740,10749,10860,10877,10886,20074:20075,20107,20110:20113,20118:20119,20121,22189,22501,22599,22606,22700,22702,22704,24003:24024,24500:24508,26410:26434
11 lifestyle and environment - exercise and work 1001,1011,796,806,816,826,845,864,874,884,894,904,914,924,943,971,981,991,1021,1050:10:1220,2624,2634,3426,3637,3647,6143,6162,6164,10953,10962,10971,20277,20614,20656,20657,20668,20669,20670,20733,20741,20749,22604,22605,22607:22615,22620,22630,22631,22640:22655,104900,104910,104920
12 lifestyle and environment - food and drink 1289:10:1389,1408:10:1548,2654,3089,3680,6144,10007,10723,10767,10776,10855,10912,20084:20094,20098:20106,20108:20109,20600:20613,20615:20616,20618:20640,20642:20655,20658:20667,20671:20681,20683:20708,20710:20728,20730:20732,20734:20740,20743:20748,21100:21101,26000:26155,100001:100009,100011:100019,100021:100025,100010:10:100560,100760:10:104670
13 lifestyle and environment - alcohol 1558:10:1628,2664,3731,3859,4407,4418,4429,4440,4451,4462,5364,10818,20095:20097,20117,20403:20410,20414:20416,20617,20682,20709,20729,20742,100580:10:100740
14 lifestyle and environment - tobacco 1239:10:1279,2644,2867:10:2907,2926,2936,3159,3436:10:3506,5959,6157,6158,6183,6194,10115,10827,10895,20116,20160:20162,20641,22506:22508
20 physical measures - general 46:51,1707,1717,1727,1747,1757,2306,3059,3062:3065,3088,3160,10691,10693:10696,10714,10717,12143:12144,20015,20255:20258,21001,21002,21110:21111,21114:21118,21120:21121,21123:21124,21127:21130,21132:21133,21135,22400:22414,22427,23098:23130,23244:23289,23355:23364,24352:24354
21 physical measures - bone density and sizes 77,78,3083:3086,3143:3144,3146:3148,4092,4095,4100:4101,4103:4106,4119:4120,4122:4125,4138:4147,21005,21112:21113,21119,21122,21125:21126,21131,21134,21136:21138,23200:23243,23290:23320,23325:23344
22 physical measures - cardiac & blood vessels 93:95,102,4079,4080,4136,4194:4196,4198:4200,4204:4205,4207,5983,5984,5986,5992,5993,6014:6017,6019,6020,6022,6024,6032:6034,6038,6039,12673:12687,12336,12338,12340,12697,12698,12702,21021,22330:22338,22420:22426,22670:22685,24100:24181
23 hearing test 4229:4230,4232:4237,4239:4247,4249,4268:4270,4272,4275:4277,4279,4849,10793,20019,20021,20060
24 eye test 5076:5079,5082:5091,5096:5119,5132:5136,5138:5149,5152,5155:5164,5181:5183,5186,5188,5190,5193,5198:5199,5201,5202,5204,5206,5208,5209,5211,5215,5221,5237,5251,5254:5259,5261:5267,5273,5274,5276,5292,5306,5324:5328,6070:6075,7495:7539,7541:7542,7544,20055,20057,20261:20262,27800:27841,27851:27858,28500:28553
25 physical activity measures 5985,22032:22040,90002:90003,90010:90013,90015:90177,90179:90195
26 abdominal measures 21080:21093,21160:21163,22415:22417,22432:22436
30 blood assays 74,23000:23044,23049:23060,23062,23063,23065:23071,23073:23075,23165,23400:23648,30000:10:30300,30314:10:30344,30364:10:30424,30500:10:30530,30600:10:30890
31 brain IDPs 24360:24409,24418:24486,25000:25746,25754:25759,25761:25768,25781:25930,26500:27772
32 cognitive phenotypes 62,111,396:404,630,4250:4256,4258:4260,4281:4283,4285,4287,4290:4292,4294,4924,4935,4957,4968,4979,4990,5001,5012,5556,5699,5779,5790,5866,6312:6313,6325,6332,6333,6348:6351,6362,6364,6365,6373,6374,6382,6383,6770:6773,7676:7678,10133:10134,10136:10144,10146:10147,10241,10609:10610,10612,20016,20018,20023,20082,20128:20157,20159,20165,20167,20169:2:20197,20192,20196:2:20200,20229,20230,20240,20242,20244:20248,21004,23045:23047,23072,23076:23079,23321:23324,26302,26306
50 health and medical history, health outcomes 84,87,92,134:137,2178,2188,2207,2217,2227,2247,2257,2296,2316,2335:10:2365,2415,2443:10:2473,2492,2674,2684,2694,2704,2844,2956:10:2986,3005,3079,3140,3393,3404,3414,3571,3606,3616,3627,3741,3751,3761,3773,3786,3799,3809,3894,3992,4012,4022,4041,4056,4067,4689,4700,4717,4728,4792,4803,4814,5408,5419,5430,5441,5452,5463,5474,5485,5496,5507,5518,5529,5540,5610,5832,5843,5855,5877,5890,5901,5912,5923,5934,5945,6119,6147,6148,6149,6150,6151,6152,6153,6154,6155,6159,6177,6179,6205,6671,10004:10006,10854,20001:20011,20199,21024:21045,21047:21061,21064:21065,21067,21068,21070:21076,22126:22181,22502:22505,22616,22618,22619,27981:27982,27989:27993,28000,28011:28029,28140,28147:28165,28600:28756,29150:29153,29155:29161,40001:41000,41002:41253,41256,41258,41266,41267,41269:41273,41275:41278,41284:41286,42000:42013
51 mental health self-report 1920:10:2110,4526,4537,4548,4559,4570,4581,4598,4609,4620,4631,4642,4653,5375,5386,5663,5674,6156,20122,20126:20127,20401,20411,20413,20417:20423,20425:20429,20431:20442,20445:20450,20453:20463,20465:20467,20470:20471,20473,20476,20477,20479:20484,20485:20544,20546:20554,21062:21063,29000:29149,29154,29162:29182
52 experience of pain 120000:120127
60 health dates 41257,41260,41262,41263,41268,41280:41283,42014:2:42020,42026,42030,42032,130004,130008,130014:2:130022,130060:2:130064,130070,130082,130104,130106,130118,130134,130148,130174:2:130178,130184:2:130190,130194:2:130200,130202,130212,130216,130218,130224:2:130230,130254,130264,130310,130320,130336:2:130344,130622:2:130626,130632,130634,130642,130646,130648,130656:2:130660,130664,130666,130670,130686,130688,130696:2:130708,130714,130718,130722:2:130726,130734:2:130738,130770,130774,130784,130792,130804,130814:2:130820,130826:2:130832,130836:2:130842,130846,130848,130852,130854,130868,130874,130890:2:130898,130902:2:130910,130914:2:130924,130932,130944,130998,131000,131016,131022,131030,131032,131036,131038,131042,131046,131048,131052:2:131056,131060:2:131066,131070:2:131078,131082:2:131092,131102:2:131110,131114,131118,131124:2:131132,131136,131138,131142,131144,131148:2:131154,131158,131160,131164:2:131168,131174,131178:2:131186,131190,131192,131196,131198,131202,131204,131208:2:131216,131220:2:131224,131228,131230,131234,131236,131242:2:131246,131250,131252,131256:2:131264,131270,131280,131282,131286,131290,131296:2:131300,131304:2:131310,131314,131316,131322,131324,131328,131330,131338,131342:2:131356,131360:2:131370,131374,131378:2:131392,131396,131400:2:131416,131422:2:131432,131436,131440:2:131446,131450,131456,131458,131462:2:131484,131488:2:131494,131498,131502,131518,131524,131528,131534,131538,131540,131546,131548,131554,131556,131560:2:131586,131590:2:131594,131598:2:131604,131608:2:131620,131624:2:131654,131658,131662,131666:2:131670,131674:2:131684,131688:2:131692,131698:2:131708,131716,131720,131722,131726:2:131730,131734:2:131742,131746,131748,131754,131760,131766,131768,131774,131778,131782,131788:2:131798,131802:2:131806,131810,131812,131820:2:131826,131830,131834,131836,131840,131848:2:131852,131858:2:131864,131868:2:131888,131892,131894,131898,131900,131904,131906,131910:2:131914,131916,131918,131922:2:131930,131934,131938:2:131942,131946:2:131950,131954:2:131964,131970:2:131982,131986:2:131996,132002,132004,132008,132014,132016,132020,132022,132030:2:132038,132042,132050,132054:2:132058,132062:2:132066,132070:2:132078,132082:2:132092,132096:2:132112,132116,132118,132122,132124,132128:2:132152,132156,132160:2:132170,132174,132186,132188,132192:2:132196,132202,132206,132212,132216,132220:2:132224,132228:2:132232,132238:2:132244,132248:2:132252,132256,132260:2:132264,132268,132270,132274:2:132280,132288,132298,132312,132468,132472,132500,132510,132518,132522,132532,132536,132542,132562,132574
70 health sources 42015:2:42021,42027,42031,42033,130005,130009,130015:2:130023,130061,130063,130065,130071,130083,130105,130107,130119,130135,130149,130175:2:130179,130185:2:130191,130195:2:130201,130203,130213,130217,130219,130225:2:130231,130255,130265,130311,130321,130337,130341:2:130345,130623:2:130627,130633,130635,130643,130647,130649,130657:2:130661,130665,130667,130671,130687,130689,130697:2:130709,130715,130719,130723:2:130727,130735,130737,130739,130771,130775,130785,130793,130805,130815:2:130821,130827:2:130833,130839,130843,130847,130849,130853,130855,130869,130875,130891:2:130899,130903:2:130911,130915:2:130925,130933,130945,130999,131001,131017,131023,131031,131033,131037,131039,131043,131047,131049,131053,131055,131057,131061:2:131067,131071:2:131079,131083:2:131093,131103:2:131111,131115,131119,131125,131129,131131,131133,131137,131139,131143,131145,131149:2:131155,131159,131161,131165,131167,131175,131179:2:131187,131191,131193,131197,131199,131203,131205,131209:2:131217,131223,131225,131229,131231,131237,131243:2:131247,131251,131253,131257:2:131265,131271,131281,131283,131287,131291,131297,131299,131305,131307,131309,131311,131315,131317,131323,131325,131329,131331,131339,131343:2:131357,131361:2:131371,131375,131379:2:131393,131397,131401,131403,131407:2:131417,131423:2:131433,131437,131441:2:131447,131451,131457,131459,131463:2:131485,131489:2:131495,131499,131503,131519,131525,131529,131535,131539,131541,131547,131549,131555,131557,131561,131563,131565:2:131587,131591,131593,131595,131599:2:131605,131609:2:131621,131625:2:131655,131659,131663,131667:2:131671,131675:2:131685,131689:2:131693,131699:2:131709,131717,131721,131723,131727:2:131731,131735:2:131743,131747,131749,131755,131761,131767,131769,131775,131779,131783,131791:2:131799,131803,131805,131807,131811,131813,131821:2:131827,131831,131835,131837,131841,131849,131851,131859:2:131865,131869:2:131889,131893,131895,131899,131901,131905,131907,131911,131913:2:131919,131923:2:131931,131935,131939,131941,131943,131947,131949,131951,131955:2:131965,131971:2:131981,131987:2:131997,132003,132005,132009,132015,132017,132021,132023,132031:2:132039,132043,132051,132055,132057,132059,132063:2:132067,132071:2:132079,132083:2:132093,132097:2:132109,132111,132113,132117,132119,132123,132125,132129:2:132153,132157,132161:2:132171,132175,132187,132189,132193:2:132197,132203,132207,132213,132217,132221:2:132225,132245,132265,132269,132271,132275:2:132281,132289,132299,132469,132473,132501,132511,132519,132523,132533,132537,132543,132563,132575,132313
97 COVID misc, ignored 27983:27986,28001,28005,28006,28008:28010,28030,28032:28127,28141:28143,28146,28166,28167,28174,41001
98 pending, to sort out categories later 41259,41261,41264,42038:42040
99 misc, ignored 19,21,35:45,68,96,120,200,393,757,1647,2129,3060,3061,3066,3077,3081:3082,3090,3132,3137,3166,4081,4093,4096,4186,4206,4238,4248,4257,4286,4288:4289,4293,4295,5074,5075,5080,5081,5214,5253,5268,5270,5987:5988,5991,6023,6025,6314,6315,6317,6334,6448,6459,6470,6481,6492,6503,6514,6525,6536,6547,7381:7410,7669:7674,10145,10697,12139:12141,12148,12187,12188,12223,12224,12253,12254,12291,12323,12623,12624,12651:12654,12658,12663,12664,12671,12688,12695,12699,12700,12704,12706,12848,12851,12854,20012:20014,20024:20025,20031:20032,20035,20041:20054,20058:20059,20061:20062,20071,20072,20077:20081,20083,20114:20115,20158,20201:20227,20241,20249:20254,20259,20260,20400,20599,20750,20751,21003,21011:21018,21023,21069,21094,21611,21621,21622,21625,21631,21634,21636,21642,21651,21661:21666,21671,21711,21721:21723,21725,21731:21734,21736,21738,21741,21742,21751,21761:21766,21771,21821:21823,21825,21831:21834,21836,21838,21841:21842,21851,21861:21866,21871,22499,22500,22600:22603,22617,22660:22664,23048,23160:23164,23649,23650,23658:23660,23762,23772,23774,23775,25747:25753,27980,28003,29183:29209,30001:10:30301,30002:10:30302,30003:10:30303,30004:10:30304,30354,30502:10:30522,30532,30601:10:30891,30602:10:30742,30762:10:30892,30605:10:30895,30666,30796,30806,30826,30856,30897,30900:30902,40000,41289,41290,90001,90004,105010,105030,110001,110002,110005,110006,110008,120128
100 exclude 20243,20263:20267,21811,31000:31028
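The Variables column above mixes single data-field IDs with ranges written as `start:stop` or `start:step:stop`. The sketch below expands such an expression into an explicit list of IDs. It assumes that both ends of a range are inclusive, which is consistent with entries such as `2714:10:2834` but is not stated in the listing; the function name `expand_variables` is illustrative.

```python
def expand_variables(spec):
    """Expand a comma-separated variable expression into a list of IDs.

    Supported tokens (assumed inclusive at both ends):
      '1707'          -> single ID
      '46:51'         -> start:stop
      '1239:10:1279'  -> start:step:stop
    """
    ids = []
    for token in spec.split(","):
        parts = [int(p) for p in token.strip().split(":")]
        if len(parts) == 1:
            ids.append(parts[0])
        elif len(parts) == 2:
            start, stop = parts
            ids.extend(range(start, stop + 1))
        elif len(parts) == 3:
            start, step, stop = parts
            ids.extend(range(start, stop + 1, step))
        else:
            raise ValueError(f"Unrecognised range: {token!r}")
    return ids


print(expand_variables("46:51,1707"))
# [46, 47, 48, 49, 50, 51, 1707]
print(expand_variables("1239:10:1279"))
# [1239, 1249, 1259, 1269, 1279]
```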
funpack/configs/fmrib/datacodings_navalues.tsv: NA value replacement rules. The ID column specifies the UKB data-coding, and the NAValues column is a comma-separated list of values which will be removed. Each rule is applied to every data-field that uses the corresponding data-coding. Only the first 20 rules are shown here.
ID NAValues
13 -1,-3
14 -1,-3
17 -1
37 -1,-3
42 -313,-818
90 -3
96 -121,-818
170 -1
214 9999
218 99
221 9,X
222 99
272 1900-01-01
339 -818
402 0
408 -818
439 1900-01-01
480 -818
485 -1
486 -121,-818
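To make the effect of an NA insertion rule concrete, the sketch below applies one NAValues entry to a plain list of values, replacing every listed value with `None`. It is a pure-Python illustration of the rule format, not FUNPACK's implementation; the function names are assumptions.

```python
def parse_na_values(spec):
    """Parse a comma-separated NAValues entry, e.g. '-1,-3' or '9,X'.

    Numeric tokens are stored as floats so that they match both int and
    float data; anything else is kept as a string.
    """
    na_values = set()
    for token in spec.split(","):
        token = token.strip()
        try:
            na_values.add(float(token))
        except ValueError:
            na_values.add(token)
    return na_values


def insert_na(values, spec):
    """Return a copy of values with every listed NA value set to None."""
    na_values = parse_na_values(spec)
    return [None if v in na_values else v for v in values]


# Rule for data-coding 13: "-1,-3"
print(insert_na([1, -1, 2, -3], "-1,-3"))
# [1, None, 2, None]

# Rule for data-coding 221: "9,X"
print(insert_na([9, "X", 3], "9,X"))
# [None, None, 3]
```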
funpack/configs/fmrib/datacodings_recoding.tsv: Categorical recoding rules. The ID column specifies the UKB data-coding, the RawLevels column is a comma-separated list of values to be replaced, and the NewLevels column is a comma-separated list of values to replace the RawLevels with. Each rule is applied to every data-field that uses the corresponding data-coding. Only the first 20 rules are shown here.
ID RawLevels NewLevels
16 -1,0,1,2,3 0,0,0,0,1
18 1,2 2,1
29 9 0
487 -1 31
488 -1001 0.5
489 9,0,1 0,1,2
493 -131,-141 1,2
494 -1520,-2030,-3040,4000 1,2,3,4
496 111,112,113,114 3,2,1,0
511 -999 max()+1
517 -999 max()+1
528 -999 max()+1
530 -999 max(min()-1, 0)
537 1,2,3,4,5,6 6,5,4,3,2,1
946 -1001 0.5
950 -500,-501,-502,-503,-504 0,1,2,3,4
957 -777 52
1018 -600,-601,-602 0,1,2
1231 -7,2,1 0,1,2
1990 0 max()+1
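A recoding rule pairs each raw level with the new level in the same position. The sketch below applies rules whose new levels are plain numbers. Expression-valued entries such as `max()+1` are deliberately rejected, because how FUNPACK evaluates them is not described here; the function name `recode` is illustrative.

```python
def recode(values, raw_levels, new_levels):
    """Replace each raw level with the new level at the same position.

    Only numeric literals are handled; a NewLevels entry such as
    'max()+1' raises ValueError instead of being guessed at.
    """
    raw = [float(t) for t in raw_levels.split(",")]
    new = [float(t) for t in new_levels.split(",")]
    if len(raw) != len(new):
        raise ValueError("RawLevels and NewLevels differ in length")
    mapping = dict(zip(raw, new))
    # Values without a rule are passed through unchanged
    return [mapping.get(v, v) for v in values]


# Rule for data-coding 18: swap levels 1 and 2
print(recode([1, 2, 2, 1], "1,2", "2,1"))
# [2.0, 1.0, 1.0, 2.0]

# Rule for data-coding 16: collapse to a binary variable
print(recode([-1, 0, 1, 2, 3], "-1,0,1,2,3", "0,0,0,0,1"))
# [0.0, 0.0, 0.0, 0.0, 1.0]
```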
funpack/configs/fmrib/variables_clean.tsv: Custom cleaning functions. The ID column specifies the UKB data-field, and the Clean column specifies the cleaning function to apply (see the FUNPACK documentation for an overview of all built-in cleaning functions). Only the first 20 rules are shown here.
ID Clean
3066 parseSpirometryData
10697 parseSpirometryData
funpack/configs/fmrib/datetime_formatting.tsv: Output format for all date/time data-fields. These functions are defined in the built-in funpack.plugins.fmrib plugin module.
Type Clean
date normalisedDate
time normalisedAcquisitionTime
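The comments in fmrib.cfg describe the normalisation as mapping a date or date+time to a value x, where floor(x) is the calendar year and the fraction is the position within the year, with a day redefined as 7am to 8pm. The sketch below is one way to realise that description. It is not the code in funpack.plugins.fmrib: the clamping of times outside 7am to 8pm and the leap-year handling are assumptions, and the snake_case function names are chosen to avoid suggesting otherwise.

```python
import calendar
from datetime import datetime


def normalised_date(d):
    """Year plus the fraction of the year elapsed at the start of day d."""
    days_in_year = 366 if calendar.isleap(d.year) else 365
    day_index = d.timetuple().tm_yday - 1  # 0 for 1 January
    return d.year + day_index / days_in_year


def normalised_acquisition_time(dt, start_hour=7, end_hour=20):
    """As normalised_date, but the within-day fraction runs from 0 at
    start_hour to 1 at end_hour (assumption: times outside are clamped)."""
    days_in_year = 366 if calendar.isleap(dt.year) else 365
    day_index = dt.timetuple().tm_yday - 1
    hours = dt.hour + dt.minute / 60 + dt.second / 3600
    day_fraction = (hours - start_hour) / (end_hour - start_hour)
    day_fraction = min(max(day_fraction, 0.0), 1.0)
    return dt.year + (day_index + day_fraction) / days_in_year


print(normalised_date(datetime(2020, 1, 1)))
# 2020.0
print(normalised_acquisition_time(datetime(2021, 1, 1, 13, 30)))
```

With these assumptions, 13:30 falls halfway through the 13-hour scanning day, so the second value is 2021 plus half a day expressed as a fraction of the year.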
funpack/configs/fmrib/variables_parentvalues.tsv: Child value replacement rules applied to data-fields. The ID column specifies the UKB data-field, the ParentValues column is an expression which is evaluated on the parent data-fields, and the ChildValues column is a value to set the data-field to when the expression evaluates to true. Only the first 20 rules are shown here.
ID ParentValues ChildValues
757 v6142 == 1 0
767 v6142 == 1 0
777 v6142 == 1 0
796 v6142 == 1 || v777 == 0 0
806 v6142 == 1 0
816 v6142 == 1 0
826 v6142 == 1 0
874 v864 >= 1 0
894 v884 >= 1 0
914 v904 >= 1 0
924 v864 >= 1 0
943 v864 >= 1 0
971 v6164 == 1 0
981 v6164== 1 0
991 v6164== 3 0
1001 v6164 == 3 0
1011 v6164 == 4 0
1021 v6164 == 4 0
1120 v1110 == -1 || v1110 == 0 0
1130 v1110 == -1 || v1110 == 0 0
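The ParentValues expressions above compare parent data-fields (written `v<ID>`) against constants and combine the comparisons with `||`. The sketch below evaluates such an expression for one subject, represented as a dict from data-field ID to value. It covers only the comparison operators plus `||` and `&&` (the latter does not appear in the rows shown and is included as an assumption); the function names are illustrative and this is not FUNPACK's expression evaluator.

```python
import operator
import re

OPS = {
    "==": operator.eq, "!=": operator.ne,
    ">=": operator.ge, "<=": operator.le,
    ">": operator.gt, "<": operator.lt,
}
ATOM = re.compile(r"^\s*v(\d+)\s*(==|!=|>=|<=|>|<)\s*(-?\d+(?:\.\d+)?)\s*$")


def evaluate(expr, row):
    """Evaluate an expression such as 'v6142 == 1 || v777 == 0'.

    row maps data-field IDs to values; a missing parent makes the
    comparison False.
    """
    def atom(text):
        match = ATOM.match(text)
        if match is None:
            raise ValueError(f"Unsupported expression: {text!r}")
        field, op, value = match.groups()
        parent = row.get(int(field))
        if parent is None:
            return False
        return OPS[op](parent, float(value))

    # '&&' binds more tightly than '||'
    return any(all(atom(a) for a in clause.split("&&"))
               for clause in expr.split("||"))


def apply_rule(row, child, expr, child_value):
    """Set row[child] to child_value when expr is true; return the row."""
    if evaluate(expr, row):
        row[child] = child_value
    return row


# Rule "796  v6142 == 1 || v777 == 0  0"
print(apply_rule({6142: 2, 777: 0, 796: None}, 796, "v6142 == 1 || v777 == 0", 0))
# {6142: 2, 777: 0, 796: 0}
```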
funpack/configs/fmrib/processing.tsv: Processing rules applied to the data set after all cleaning stages have been performed. See the FUNPACK documentation for an overview of all of the built-in processing functions.
Variable Process
# Convert these variables into a set of binary columns, one
# column per unique value. The metaproc fields are used to
# generate useful descriptions for each column.
6150,6155,20003,20199 binariseCategorical(acrossVisits=True, acrossInstances=True, metaproc='codingdesc')
20001,20002,20004,40011,40012 binariseCategorical(acrossVisits=True, acrossInstances=True, metaproc='hierarchycodedesc')
40001,40002,40006,40013,41200,41201,41210,41256,41258,41272,41273 binariseCategorical(acrossVisits=True, acrossInstances=True, metaproc='hierarchycodedesc')
# ICD9/10 diagnoses are binarised, but instead of having
# binary 1/0 values, each column contains the date of
# diagnosis, or '0' indicating no diagnosis. After this
# step, the date columns 41280/41281 will be removed
# from the data set.
41270,41271 binariseCategorical(acrossVisits=True, acrossInstances=True, metaproc='hierarchycodedesc', fillval=0, take=[41280,41281])
# Other vars to be binarised
23165,29000,29001 binariseCategorical
# Sparsity check - for most data fields, columns will be
# dropped if they do not meet these criteria:
#
# - have at least 51 non-na values
# - have a stddev >1e-6
#
# Categorical columns will be dropped if one category
# comprises 99% of the data.
#
# We exclude a subset of categories containing data fields
# of secondary/auxiliary interest, which we always want
# present in the output.
all_except,6150,6155,20001,20002,20003,20004,20199,40001,40002,40006,40011,40012,40013,41200,41201,41202,41203,41204,41205,41210,41256,41258,41270,41271,41272,41273,cat1,cat31,cat60,cat70,cat97,cat98,cat99 removeIfSparse(minpres=51, maxcat=0.99, minstd=1e-6, abscat=False)
# Binarised vars are subjected to a slightly adjusted sparsity
# check - we drop columns which don't have at least 10 diagnoses
# (or which have fewer than 10 non-diagnoses). Note that this
# does not include main ICD9/10 columns (41270, 41271), which
# are tested separately below.
6150,6155,20001,20002,20003,20004,20199,40001,40002,40006,40011,40012,40013,41200,41201,41210,41256,41258,41272,41273 removeIfSparse(mincat=10)
# At this point, the main ICD vars will contain either
# a date, or nan (fillval=0, used above, is only applied at
# export), so a minpres test will suffice.
41270,41271 removeIfSparse(minpres=10)
# Drop columns which are correlated with other columns (the one
# with more missing values is dropped). If processing
# unknown/uncategorised data fields, do not remove columns that
# are redundant w.r.t. those unknowns/uncategorised.
#
# We exclude ICD9/10 primary/secondary diagnosis columns - they
# will be ingested in the createDiagnosisColumn step below.
#
# We also exclude a subset of categories containing data fields
# of secondary/auxiliary interest, which we always want
# present in the output.
all_except,41202,41203,41204,41205,cat1,cat31,cat60,cat70,cat97,cat98,cat99 removeIfRedundant(0.99, 0.2, skipUnknowns=True)
# Create binary columns for ICD9/ICD10 diagnoses,
# denoting each diagnosis as being either primary or
# secondary. After this step, the primary/secondary
# diagnosis data fields (41202, 41203, 41204, 41205)
# will be removed from the data set, and replaced with
# binary columns denoting each diagnosis as either
# primary or secondary.
41270 createDiagnosisColumns(41202, 41204, binarised=True)
41271 createDiagnosisColumns(41203, 41205, binarised=True)
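The sparsity criteria described in the comments above (at least 51 non-NA values, a standard deviation above 1e-6, and no single category making up 99% of the data) can be written out as a small check on one column. The sketch below is an illustration of those criteria only, not FUNPACK's removeIfSparse: it uses the population standard deviation, treats `None` as missing, and applies the category test only when `maxcat` is given, all of which are assumptions.

```python
from collections import Counter
from statistics import pstdev


def is_sparse(values, minpres=51, minstd=1e-6, maxcat=None):
    """Return True if a column would be dropped under the stated criteria."""
    present = [v for v in values if v is not None]

    # Too few non-missing values
    if not present or len(present) < minpres:
        return True

    # Effectively constant column
    if pstdev(present) <= minstd:
        return True

    # One category dominates (only checked when maxcat is given)
    if maxcat is not None:
        top_count = Counter(present).most_common(1)[0][1]
        if top_count / len(present) >= maxcat:
            return True

    return False


print(is_sparse(list(range(100))))               # varied, well populated
print(is_sparse([1] * 100))                      # constant
print(is_sparse(list(range(50)) + [None] * 50))  # only 50 values present
print(is_sparse([0] * 99 + [1], maxcat=0.99))    # one category at 99%
```

Only the first column is kept; the other three each fail one of the criteria.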
Some additional configuration profiles enhance the fmrib profile with some extra options.
The fmrib_standard profile incorporates all options from the fmrib profile, but it only loads data-fields from the categories listed in fmrib_cats.cfg, and it also configures funpack to output verbose logging information and additional summary files.
fmrib_standard.cfg
#
# FUNPACK "fmrib_standard.cfg" configuration file
#
# This is the configuration used for standard internal FMRIB runs.
config_file fmrib_logs
config_file fmrib_cats
fmrib_logs.cfg
#
# FUNPACK "fmrib_logs.cfg" configuration file
#
# Provides FMRIB-recommended processing of UKB tables, with maximal generation
# of ancillary and diagnostic files and log information
#
# Use all settings in fmrib.cfg
config_file fmrib
# Produce maximal diagnostic message reporting
noisy
noisy
noisy
# Send all output to a file
write_log
# Create a table recording the mapping of alpha-numeric ICD10 codes to pure
# numeric
write_icd10_map
# Create a table with the descriptions of each variable in the output file
write_description
# Create a summary of all processing steps applied
write_summary
fmrib_cats.cfg
#
# FUNPACK "fmrib_cats.cfg" configuration file
#
# Provides recommended set of categories to process, from full list of
# FMRIB-curated categories (see funpack/configs/fmrib/categories.tsv)
#
category 1
category 2
category 3
category 10
category 11
category 12
category 13
category 14
category 20
category 21
category 22
category 23
category 24
category 25
category 26
category 30
category 31
category 32
category 50
category 51
category 52
category 60
category 70
category 97
category 98
category 99
exclude_category 100
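The fmrib_standard profile is built by chaining `config_file` lines: it includes fmrib_logs and fmrib_cats, and fmrib_logs in turn includes fmrib. The sketch below flattens such a chain into a single list of option lines. The file contents are heavily abbreviated stand-ins for the listings above, the in-place expansion order is an assumption about how includes are handled, and the function name `resolve` is illustrative.

```python
def resolve(name, configs, _stack=()):
    """Return the option lines of profile `name`, with every
    'config_file <other>' line replaced by the lines of <other>."""
    if name in _stack:
        raise ValueError(f"Circular include: {' -> '.join(_stack + (name,))}")
    lines = []
    for line in configs[name].splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if parts[0] == "config_file" and len(parts) == 2:
            lines.extend(resolve(parts[1], configs, _stack + (name,)))
        else:
            lines.append(line)
    return lines


# Abbreviated, in-memory stand-ins for the .cfg files
configs = {
    "fmrib_standard": "config_file fmrib_logs\nconfig_file fmrib_cats\n",
    "fmrib_logs": "# Use all settings in fmrib.cfg\nconfig_file fmrib\nnoisy\nwrite_log\n",
    "fmrib_cats": "category 1\nexclude_category 100\n",
    "fmrib": "plugin_file fmrib\n",
}

flattened = resolve("fmrib_standard", configs)
for line in flattened:
    print(line)
```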
The fmrib_new_release profile is the same as the fmrib_standard profile, but also loads and processes any unknown or uncategorised data-fields. This is useful when processing a new data set which may contain newly added data-fields that you have not yet encountered.
fmrib_new_release.cfg
#
# FUNPACK "fmrib_new_release.cfg" configuration file
#
# Intended for use when first running FUNPACK on a new UKB data release.
# Equivalent to "fmrib_logs", but loads and processes *all* variables
# (including previously unknown and uncategorised variables), and outputs a
# separate file containing summary information about these variables.
#
# base this config on fmrib_logs,
# and include all fmrib categories
config_file fmrib_logs
config_file fmrib_cats
# Include unknown/uncategorised variables
category unknown
category uncategorised
# Save a summary of unknown vars
write_unknown_vars