CPRD Cardiovascular Disease Synthetic Dataset
Population Size
499,344
People
Years
2020
Associated BioSamples
None/not available
Geographic coverage
United Kingdom
Lead time
1-2 months
Summary
DOI for dataset
Documentation
This wholly synthetic dataset is based on real anonymised primary care patient data extracted from the CPRD Aurum database and focuses on cardiovascular disease risk factors. To preserve patient privacy, researchers cannot access the real anonymised patient data extract that was used as the basis for generating the synthetic dataset. That ground truth extract was pre-processed, so the synthetic dataset based on it does not reflect the structure of the source CPRD Aurum database. This synthetic dataset was developed as part of a project funded by the Regulators’ Pioneer Fund, launched by the Department for Business, Energy and Industrial Strategy (BEIS) and managed by Innovate UK. The methodology used to generate and evaluate this synthetic dataset is outlined in Wang et al. 2019.
Dataset type
Dataset sub-type
Dataset population size
Associated media
Keywords
Observations
| Observed Node | Disambiguating Description | Measured Value | Measured Property | Observation Date |
|---|---|---|---|---|
| Persons | Patients in the dataset | 499,344 | COUNT | 28 Jun 2020 |
Provenance
Purpose of dataset collection
Collection source setting
Patient pathway description
Image contrast
Biological sample availability
Structural Metadata
Details
Publishing frequency
Version
Modified
08/10/2024
Distribution release date
28/06/2020
Citation Requirements
Coverage
Start date
25/03/2020
Time lag
Geographic coverage
Maximum age range
Follow-up
Accessibility
Language
Controlled vocabulary
Format
Data Access Request
Dataset pipeline status
Access rights
Time to dataset access
Access request cost
Access method category
Access service description
Jurisdiction
Data use limitation
Data use requirements
Data Controller
Data Processor