Extracting Variables from the UK Biobank

Lei Clifton’s talk is suitable for researchers (MSc, DPhil, postdocs, clinicians, etc.) who would like to download and use UK Biobank (UKB) data. Previous experience with UKB data is not required. Lei Clifton has written a step-by-step guide (www.ndph.ox.ac.uk/files/research/ukb_dataguide_download_lc_v0-5_08jul2020.pdf) to the first steps of handling UK Biobank data. In the first part of the session, Lei will take you through this guide, including how to run the file handlers (ukbmd5 etc.), extract the user-specified fields and then label the extracted variables.

The aim of the session is to learn the first steps of handling UK Biobank data.

Lei’s session will cover the following topics:

How to download useful documents on data extraction from the UKB Showcase website

How to run file handlers (ukbmd5 etc.)

How to extract the user-specified fields

How to label the extracted variables

At the end of the session, participants will be able to:

Specify data fields using “vlookup” in Excel

Download the file handlers from the UKB website

Run these file handlers to download UKB data

Extract variables from UKB data
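The extraction step can be sketched in a few lines of Python. This is a minimal illustration rather than the method taught in the session: it assumes the data have already been converted to a CSV file whose columns are named "eid" plus "<field>-<instance>.<array>" (for example "31-0.0"), and the function names, field IDs and values below are made up for the example.

```python
import csv
import io


def select_columns(header, field_ids):
    """Return indices of 'eid' and of every column belonging to the given fields."""
    wanted = {str(f) for f in field_ids}
    keep = []
    for i, name in enumerate(header):
        # Assumed naming scheme: '<field>-<instance>.<array>'
        field = name.split("-", 1)[0]
        if name == "eid" or field in wanted:
            keep.append(i)
    return keep


def extract_fields(src, dst, field_ids):
    """Copy only the selected columns from src to dst, one row at a time."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader)
    keep = select_columns(header, field_ids)
    writer.writerow([header[i] for i in keep])
    for row in reader:
        writer.writerow([row[i] for i in keep])


# Tiny invented data set standing in for a real (much larger) file.
demo = io.StringIO("eid,31-0.0,21022-0.0,50-0.0\n1,0,54,170\n2,1,61,182\n")
out = io.StringIO()
extract_fields(demo, out, [31, 50])
result = out.getvalue()
print(result)
```

Processing row by row keeps memory use flat, which matters because real UKB tables can have many thousands of columns.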

Paul McCarthy’s session will introduce FUNPACK, a command-line program that can be used to extract data from UK Biobank (and other tabular) data. FUNPACK has a large number of built-in rules that are specific to the UK Biobank data set, but you can control and customise everything that FUNPACK does to your data, including which rows and columns to extract, and which cleaning/processing steps to perform on each column.

The session will comprise a brief introduction to FUNPACK, and an interactive demonstration showing what FUNPACK can be used for.

Paul’s session will cover the following topics:

The nature of UK Biobank data files

Installing FUNPACK

Common tasks that can be performed with FUNPACK

The objective of this session is to show participants how they can use FUNPACK to extract variables of interest from a UK Biobank data file.

At the end of the session, participants will be able to use FUNPACK to:

Extract a subset of variables or subjects from a large tabular UK Biobank data file

Apply recoding and/or binarisation rules to categorical variables

Apply replacement rules to hierarchical variables (e.g. ICD10)
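To make the three kinds of rule above more concrete, here is a small Python sketch of the underlying ideas. It is not FUNPACK's implementation or its command-line interface: the function names, the missing-value codes and the example diagnoses are hypothetical, and the parent lookup simply truncates an ICD-10 style code to its three-character category instead of consulting a real hierarchy table.

```python
def recode(value, mapping):
    """Replace a raw category code with a new value; unknown codes pass through."""
    return mapping.get(value, value)


def binarise(rows):
    """Turn each subject's list of codes into a {code: 0 or 1} indicator dict."""
    categories = sorted({code for row in rows for code in row})
    return [{c: int(c in row) for c in categories} for row in rows]


def to_parent(code, length=3):
    """Simplified hierarchy rule: keep only the leading category characters."""
    return code[:length]


# Hypothetical rule: treat the codes -1 and -3 as missing.
missing = {-1: None, -3: None}
recoded = [recode(v, missing) for v in [2, -1, 5, -3]]

# Three invented subjects with zero, one or two diagnosis codes each.
diagnoses = [["I251", "E119"], ["I251"], []]
indicators = binarise(diagnoses)
parents = [[to_parent(c) for c in row] for row in diagnoses]

print(recoded)
print(indicators)
print(parents)
```

Binarisation produces one indicator column per observed code, so it is usually applied after any parent replacement that reduces the number of distinct codes.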