CPRD Cardiovascular Disease Synthetic Dataset
Population Size
499,344
People
Years
2020
Associated BioSamples
None/not available
Geographic coverage
United Kingdom
Lead time
1-2 months
Summary
Documentation
This wholly synthetic dataset is based on real anonymised primary care patient data extracted from the CPRD Aurum database and focuses on cardiovascular disease risk factors. To preserve patient privacy, researchers cannot access the real anonymised patient data extract used as the basis for generating the synthetic dataset. The ground truth data extract was pre-processed, so the synthetic dataset derived from it does not reflect the structure of the source CPRD Aurum database. This synthetic dataset was developed as part of a project funded by the Regulators’ Pioneer Fund, launched by the Department for Business, Energy and Industrial Strategy (BEIS) and managed by Innovate UK. The methodology used to generate and evaluate this synthetic dataset is outlined in Wang et al. 2019.
Keywords
Observations
Observed Node | Disambiguating Description | Measured Value | Measured Property | Observation Date |
---|---|---|---|---|
Persons | Patients in the dataset | 499344 | COUNT | 28 Jun 2020 |
Provenance
Structural Metadata
Details
08/10/2024
28/06/2020
Coverage
25/03/2020