UK Biobank

Population Size

500,000

People

Years

2006

Associated BioSamples

Serum

Plasma

Geographic coverage

United Kingdom

Lead time

1-2 months

Summary

UK Biobank is a large-scale biomedical database and research resource that provides researchers access to detailed longitudinal phenotype, medical and genetic data from 500,000 volunteer participants.

Documentation

UK Biobank is a large-scale biomedical database and research resource, containing in-depth genetic and health information from half a million UK participants. The database, which is regularly augmented with additional data, is globally accessible to approved researchers and scientists undertaking vital research into the most common and life-threatening diseases. UK Biobank’s research resource is a major contributor to the advancement of modern medicine and treatment and has enabled several scientific discoveries that improve human health.

Since 2006, UK Biobank has collected an unprecedented amount of biological and medical data on half a million people, aged between 40 and 69 years old and living in the UK, as part of a large-scale prospective study. With their consent, they regularly provide blood, urine and saliva samples, as well as detailed information about their lifestyle, which is then linked to their health-related records to provide a deeper understanding of how individuals experience diseases. Genotyping, whole exome sequencing and whole genome sequencing are available for the whole cohort. Blood and urine biomarkers, telomere data, metabolomic and proteomic data and infectious disease markers have been assayed from the samples provided.

Since 2014 we have been undertaking the largest imaging study to date. We aim to undertake brain, cardiac and neck-to-knee MRI, whole-body DXA and carotid ultrasound of 100,000 participants. We additionally have retinal images for 100,000 participants from the baseline assessment, and accelerometer data for 100,000 participants collected in 2013-2014.

Questionnaires that aim to capture data that is not readily captured by health data linkages are regularly sent to our participants.

The data – the largest and richest dataset of its kind – is de-identified and made widely accessible by UK Biobank to registered researchers around the world who use it to make new scientific discoveries about common and life-threatening diseases – such as cancer, heart disease and stroke – in order to improve public health.

Dataset type

Health and disease

Dataset sub-type

Not applicable

Dataset population size

500000

Keywords

Observations

Observed Node

Persons

Disambiguating Description

Each participant has a large number (<5000) of data points associated with them. Recruitment started in 2006, but data collection is ongoing, and health data predates recruitment date. Summary statistics of all data can be found on our data showcase.

Measured Value

500000

Measured Property

Count

Observation Date

13 Mar 2006

Provenance

Purpose of dataset collection

Study

Collection source setting

Primary care - Clinic, Secondary care - Accident and Emergency, Secondary care - In-patients, Community, Clinic, Prescribing - Community pharmacy

Patient pathway description

UK Biobank is a volunteer-based cohort. As such, there is a healthy volunteer effect: compared with the general population, participants tend to be of higher socioeconomic status, to have remained in education longer, to be slimmer, to include fewer smokers (although those who smoke tend to be heavier smokers) and to consume less alcohol. A comparison between UK Biobank participants and the general UK population has been published (https://doi.org/10.1093/aje/kwx246).

Whilst selection biases are seen in UK Biobank, there is still substantial heterogeneity within the cohort. Whilst incidence and prevalence calculations are not generalisable to the UK population, exposure-outcome comparisons should be, owing to the heterogeneity in the cohort. However, it is important that researchers consider the potential biases of a data set that might limit generalisability of their results (as is the case for all observational data).
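
The distinction between prevalence estimates and exposure-outcome comparisons can be illustrated with a small expected-count calculation. The Python sketch below is not a UK Biobank method and uses no UK Biobank data: the population size, exposure share, outcome risks and volunteering probabilities are all made-up numbers, and it assumes that the chance of volunteering depends on exposure only, not on the outcome.

```python
def expected_counts(n, p_exposed, risk_exposed, risk_unexposed, join_prob):
    """Expected 2x2 counts keyed by (exposed, outcome).

    join_prob maps (exposed, outcome) to the probability of volunteering.
    All inputs are hypothetical illustration values.
    """
    counts = {}
    groups = ((True, p_exposed, risk_exposed),
              (False, 1 - p_exposed, risk_unexposed))
    for exposed, share, risk in groups:
        for outcome, p in ((True, risk), (False, 1 - risk)):
            counts[(exposed, outcome)] = n * share * p * join_prob[(exposed, outcome)]
    return counts


def summarise(counts):
    """Return (exposure prevalence, outcome prevalence, risk ratio)."""
    total = sum(counts.values())
    n_exp = counts[(True, True)] + counts[(True, False)]
    n_unexp = counts[(False, True)] + counts[(False, False)]
    exposure_prev = n_exp / total
    outcome_prev = (counts[(True, True)] + counts[(False, True)]) / total
    risk_ratio = (counts[(True, True)] / n_exp) / (counts[(False, True)] / n_unexp)
    return exposure_prev, outcome_prev, risk_ratio


# Whole population: everyone is "selected".
everyone = {(e, o): 1.0 for e in (True, False) for o in (True, False)}

# Hypothetical healthy volunteer effect: exposed people (e.g. smokers)
# volunteer half as often as unexposed people, regardless of outcome.
volunteers = {(True, True): 0.03, (True, False): 0.03,
              (False, True): 0.06, (False, False): 0.06}

population = summarise(expected_counts(1_000_000, 0.3, 0.2, 0.1, everyone))
cohort = summarise(expected_counts(1_000_000, 0.3, 0.2, 0.1, volunteers))

for label, (exp_prev, out_prev, rr) in (("population", population), ("cohort", cohort)):
    print(f"{label}: exposure {exp_prev:.3f}, outcome {out_prev:.3f}, risk ratio {rr:.2f}")
```

With these made-up inputs, exposure prevalence falls from 0.30 in the population to about 0.18 in the cohort and outcome prevalence from 0.13 to about 0.12, while the risk ratio stays at 2.0. If volunteering also depended on the outcome, the ratio would generally no longer be preserved, which is one reason the potential biases still need to be considered.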

Image contrast

Not stated

Biological sample availability

Serum, Plasma, Whole blood, Saliva, Urine

Structural Metadata

Details

Publishing frequency

Continuous

Version

2.0.0

Modified

08/10/2024

Citation Requirements

UK Biobank

Coverage

Start date

13/03/2006

Time lag

Variable

Geographic coverage

United Kingdom

Minimum age range

40

Maximum age range

69

Follow-up

Continuous

Accessibility

Language

en

Controlled vocabulary

LOCAL, OPCS4, READ, SNOMED CT, DM+D, ICD10, ICD9

Format

Text: csv, dta, SAS, R; Image: DICOM, NIFTI, PNG; Other: VCF, CRAM, PLINK, BGEN, BED, CWA

Data Access Request

Dataset pipeline status

Not available

Time to dataset access

1-2 months

Access method category

Varies based on project

Access service description

Applications to access data are made through our bespoke access management system (https://bbams.ndph.ox.ac.uk/ams/).

Data access is either via data download (phenotype and genotype data) or via our Research Analysis Platform (RAP), which covers phenotype, imaging, genotype, WES, WGS and omics data. Our RAP is enabled by DNAnexus and hosted by Amazon Web Services (https://www.ukbiobank.ac.uk/enable-your-research/research-analysis-platform).

Access costs depend on what data access is required.

Jurisdiction

GB-ENG

Data use limitation

General research use

Data use requirements

Institution-specific restrictions, Project-specific restrictions, Publication required, Return to database or resource, User-specific restriction, Time limit on use

Data Controller

UK Biobank

Data Processor

UK Biobank
