Data Study Group Final Report: Department for Work and Pensions


This challenge explored methods to gauge the suitability of synthetic data, including particular datasets provided by two commercial teams. The data synthesis methods we consider start from an original, sensitive dataset, which raises two key questions. First, how well is the privacy of individuals present in the original dataset protected? (Alternatively, how much can be inferred about the original dataset from the synthetic data?) Second, how suitable is the synthetic data as a substitute for the original data for its intended uses? We refer to the latter as its utility.

The aim of this challenge is to explore these questions, with a focus on (but not limited to) several synthetic datasets provided by DWP, and to examine how privacy and utility trade off against one another.

Alan Turing Institute Data Study Group