Introduction to data linkage and analysing linked data – Oxford

Computer software and computer workshops:
This event includes computer workshops. Participants will need to bring their own laptop running a Windows operating system, with Excel and LinkPlus installed (LinkPlus is freely available from the Centers for Disease Control and Prevention website). Please note that LinkPlus is not compatible with Macs.

This is a training and capacity building event co-organised by the Administrative Data Research Centre England (ADRC_E) and the Consumer Data Research Centre (CDRC).

This short course gives participants a practical introduction to data linkage. It is aimed at researchers who intend to carry out data linkage themselves or who want to understand the process so that they can analyse linked data. Introduction to Data Linkage will cover examples of the uses of data linkage, data preparation, and methods for linkage (including deterministic, probabilistic and privacy-preserving approaches).

The main focus of this course will be health data, although the concepts will apply to many other areas. This course includes a mixture of lectures and practical sessions that will enable participants to put theory into practice.

We recommend booking this course together with the separate course ‘Evaluating linkage quality for the analysis of linked data’ on 11 May 2018, which will cover processing of linked data, concepts of linkage error and bias, and handling linkage error in analysis. It can also be booked as a separate one-day course.

Target audience:
The course is aimed at researchers who need to gain an understanding of data linkage techniques and of how to analyse linked data. It provides an introduction to data linkage theory and methods for those who might be using linked data in their own work. Participants may be academic researchers in the social and health sciences, or may work in government, survey agencies, official statistics, charities or the private sector.

The course does not assume any prior knowledge of data linkage. Some experience of using Excel or other software will be useful for the practical session.

Course content:

- Overview of data linkage (data linkage systems, benefits of data linkage, types of projects)

- Linkage methods (deterministic and probabilistic, privacy-preserving)

- The linkage process (data preparation, blocking, classification)

- Overview of linkage error

- Practical sessions
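
As a rough illustration of the linkage process listed above (data preparation, blocking, classification), the following Python sketch links two toy files. It is not taken from the course materials and does not use LinkPlus. The records, the field names, the m- and u-probabilities and the threshold of 10 are all invented for the example, and the weight calculation is only a simplified version of the usual log-ratio agreement weights used in probabilistic linkage.

```python
from collections import defaultdict
from math import log2

FIELDS = ("forename", "surname", "dob", "postcode")

# Hypothetical toy records; none of these values come from a real dataset.
file_a = [
    {"id": "a1", "forename": "Ann ", "surname": "SMITH", "dob": "1980-01-02", "postcode": "OX1 1AA"},
    {"id": "a2", "forename": "Jon", "surname": "Jones", "dob": "1975-06-30", "postcode": "OX2 6GG"},
]
file_b = [
    {"id": "b1", "forename": "ann", "surname": "smith", "dob": "1980-01-02", "postcode": "ox11aa"},
    {"id": "b2", "forename": "John", "surname": "Jones", "dob": "1975-06-30", "postcode": "OX2 6GG"},
    {"id": "b3", "forename": "Mary", "surname": "Jones", "dob": "1990-03-15", "postcode": "OX2 6GG"},
]

# Assumed per-field probabilities (illustrative only):
# m = P(fields agree | records belong to the same person)
# u = P(fields agree | records belong to different people)
M_U = {
    "forename": (0.90, 0.05),
    "surname": (0.95, 0.02),
    "dob": (0.97, 0.01),
    "postcode": (0.85, 0.03),
}
THRESHOLD = 10.0  # assumed cut-off for accepting a probabilistic link


def clean(record):
    """Data preparation: remove all whitespace and lower-case every field."""
    cleaned = {"id": record["id"]}
    for field in FIELDS:
        cleaned[field] = "".join(record[field].split()).lower()
    return cleaned


def candidate_pairs(records_a, records_b, key="surname"):
    """Blocking: only compare records that share the same blocking key."""
    blocks = defaultdict(list)
    for rec_b in records_b:
        blocks[rec_b[key]].append(rec_b)
    for rec_a in records_a:
        for rec_b in blocks.get(rec_a[key], []):
            yield rec_a, rec_b


def match_weight(rec_a, rec_b):
    """Sum log2 agreement or disagreement weights over all fields."""
    total = 0.0
    for field, (m, u) in M_U.items():
        if rec_a[field] == rec_b[field]:
            total += log2(m / u)
        else:
            total += log2((1 - m) / (1 - u))
    return total


clean_a = [clean(r) for r in file_a]
clean_b = [clean(r) for r in file_b]
pairs = list(candidate_pairs(clean_a, clean_b))

# Classification, deterministic rule: exact agreement on every field.
deterministic_links = [
    (a["id"], b["id"]) for a, b in pairs if all(a[f] == b[f] for f in FIELDS)
]

# Classification, probabilistic rule: total weight at or above the threshold.
probabilistic_links = [
    (a["id"], b["id"]) for a, b in pairs if match_weight(a, b) >= THRESHOLD
]

print("Deterministic links:", deterministic_links)
print("Probabilistic links:", probabilistic_links)
```

In this toy data the deterministic rule links only the pair that agrees on every field after cleaning, while the weight-based rule also accepts the pair whose forenames differ ("Jon" and "John"). The third candidate pair, which also disagrees on date of birth, falls below the assumed threshold.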

Learning outcomes:

- Understand the background and theory of data linkage methods

- Prepare data for linkage

- Perform deterministic and probabilistic linkage

Course outline:

- Introduction

- Preparing data for linkage

- Deterministic linkage, privacy-preserving linkage

- Linkage error

- Advanced linkage methods including probabilistic linkage

- Practical session

- Multiple files and emerging methods