HDR UK Gateway

OpenGWAS

Population Size

1,000,000

People

Years

2018

Associated BioSamples

None/not available

Geographic coverage

Worldwide

Lead time

Not applicable

Summary

OpenGWAS integrates and harmonises genome-wide association study (GWAS) summary statistics published by research groups and consortia worldwide. The OpenGWAS platform provides an application programming interface (API) alongside a suite of Python and R packages to support extraction and analysis of data.

Documentation

Developed at the MRC Integrative Epidemiology Unit (IEU) at the University of Bristol, this resource is a manually curated collection of complete GWAS summary datasets made available as open source files for download, or by querying a database of the complete data.

For full details, including the dataset contributors and the researchers who developed the resource, see our website.

Dataset history

This project began as the underlying database for the MR-Base (decommissioned) and LD Hub projects. The data now serve as an input source for a wider range of analytical tools that implement methods such as Mendelian randomization, fine mapping, colocalisation and GWAS visualisation. Please see the API page for a list of R and Python packages that connect to the data.

Dataset access

Access to the resource is currently free of charge for academic researchers, subject to fair use limits to avoid overloading our services. Users can register and receive an API token directly on the website.
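As an illustration of token-based access, the following minimal Python sketch builds (but does not send) an authenticated request. The base URL `https://api.opengwas.io/api`, the `status` path and the Bearer header scheme are assumptions made for this example and are not stated on this page; please check the API documentation for the authoritative details.

```python
import urllib.request

# Assumed base URL for this sketch; confirm it in the API documentation.
API_BASE = "https://api.opengwas.io/api"


def build_request(path, token):
    """Return an unsent GET request that carries the token as a Bearer header."""
    url = f"{API_BASE}/{path.lstrip('/')}"
    headers = {"Authorization": f"Bearer {token}"}  # assumed header scheme
    return urllib.request.Request(url, headers=headers, method="GET")


# "status" is an assumed endpoint path; the token is a placeholder.
request = build_request("status", "YOUR_TOKEN_HERE")
print(request.full_url)
```

To send the request, pass it to `urllib.request.urlopen`. Requests made this way are subject to the fair use limits described above.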

The database mainly comprises publicly available datasets, but also includes a number of private datasets whose access is controlled through OAuth 2.0 authentication. Please contact us if you have datasets that you would like to add, whether public or private.

We have made all the public data available for download. We store the GWAS summary data in the GWAS-VCF format to ensure alignment with the hg19 reference sequence and to enable very fast querying. More information is available in the GWAS-VCF specification and on bioRxiv.
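To give a feel for what a GWAS-VCF record contains, here is a minimal Python sketch that reads one tab-separated data line. It assumes the per-study FORMAT keys `ES` (effect size), `SE` (standard error) and `LP` (-log10 p-value); the example row is made up for illustration, and the GWAS-VCF specification remains the authoritative description of the fields.

```python
def _to_float(value):
    """Convert a VCF field to float, treating '.' (missing) as None."""
    return None if value in (".", "") else float(value)


def parse_gwas_vcf_line(line):
    """Parse one data line with a single study column into a dict."""
    cols = line.rstrip("\n").split("\t")
    chrom, pos, variant_id, ref, alt = cols[:5]
    # Column 9 names the per-study fields; column 10 holds their values.
    sample = dict(zip(cols[8].split(":"), cols[9].split(":")))
    lp = _to_float(sample.get("LP", "."))
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": variant_id,
        "ref": ref,
        "alt": alt,
        "beta": _to_float(sample.get("ES", ".")),
        "se": _to_float(sample.get("SE", ".")),
        # LP is assumed to be -log10(p), so p = 10 ** (-LP).
        "pval": None if lp is None else 10 ** (-lp),
    }


# Made-up row for illustration only; not taken from any real dataset.
EXAMPLE = "1\t12345\trs0000001\tA\tG\t.\tPASS\t.\tES:SE:LP\t0.05:0.01:2"
record = parse_gwas_vcf_line(EXAMPLE)
print(record)
```

For real files, which are usually compressed and indexed, a dedicated VCF library is likely a better choice than hand-written parsing.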

A list of all datasets, along with their metadata and IDs, is also available through the API and the R and Python packages, so it should be straightforward to select a subset of datasets and download them in a loop. Note that downloading in this way carries a relatively high cost, deducted from your periodic allowance (which is usually given free of charge), because we pay the cloud service provider for storage, request handling and outbound bandwidth.
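The subset-and-loop approach might look like the following Python sketch. The metadata field names (`id`, `trait`, `sample_size`), the dataset IDs and the `fetch` callable are all hypothetical placeholders; in practice the records would come from the API or one of the packages, and `fetch` would be whatever download routine you use.

```python
def select_datasets(records, keyword, min_sample_size=0):
    """Keep records whose trait contains keyword and meet the size threshold."""
    keyword = keyword.lower()
    selected = []
    for rec in records:
        trait = (rec.get("trait") or "").lower()
        size = rec.get("sample_size") or 0  # treat missing sizes as 0
        if keyword in trait and size >= min_sample_size:
            selected.append(rec)
    return selected


def download_subset(records, fetch):
    """Call fetch(dataset_id) once per record and return the IDs processed."""
    fetched = []
    for rec in records:
        fetch(rec["id"])
        fetched.append(rec["id"])
    return fetched


# Hypothetical metadata records, invented for this example.
METADATA = [
    {"id": "example-a-1", "trait": "Body mass index", "sample_size": 50000},
    {"id": "example-a-2", "trait": "Body mass index", "sample_size": 800},
    {"id": "example-b-1", "trait": "Height", "sample_size": None},
]

subset = select_datasets(METADATA, "body mass", min_sample_size=1000)
# The no-op lambda stands in for a real download function.
fetched_ids = download_subset(subset, fetch=lambda dataset_id: None)
print(fetched_ids)
```

Filtering before the loop keeps the number of downloads, and therefore the allowance consumed, as small as possible.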

Dataset type

Health and disease, Measurements/Tests, Omics

Dataset population size

1000000

Synthetic data web link

Keywords

Dataset and BioSample Aliases

Observations

Observed Node: Findings
Disambiguating Description: Total number of GWAS datasets
Measured Value: 50069
Measured Property: COUNT
Observation Date: 17 Mar 2026

Provenance

Purpose of dataset collection

Other, Study

Source of data extraction

Other

Collection source setting

Cohort, study, trial, Clinic, Other

Patient pathway description

The dataset covers a wide range of human phenotypic traits relating to many different health outcomes.

Image contrast

Not stated

Biological sample availability

None/not available

Structural Metadata

Details

Publishing frequency

Continuous

Version

1.0.0

Modified

17/03/2026

Citation Requirements

MRC Integrative Epidemiology Unit, University of Bristol

Coverage

Start date

29/05/2018

Time lag

Variable

Geographic coverage

Worldwide

Maximum age range

150

Follow-up

Unknown

Dataset completeness

Accessibility

Language

en

Alignment with standardised data models

LOCAL

Controlled vocabulary

OTHER

Format

GWAS-VCF, text/csv

Data Access Request

Dataset pipeline status

Available

Time to dataset access

Not applicable

Access request cost

No cost for academic research users

Access method category

Direct access

Access service description

OpenGWAS is a cloud-based data aggregation and delivery service (OpenGWAS or the platform) provided by the University of Bristol (UoB) that aggregates genome-wide association study datasets (Datasets) and makes these accessible through an application programming interface (API). For further details, see: https://opengwas.io/policies/service-description

Data use limitation

Research use only

Data use requirements

Not for profit use

Data Controller

NIHR Bristol Biomedical Research Centre

Data Processor

NIHR Bristol Biomedical Research Centre

Publications about this dataset

The MR-Base platform supports systematic causal inference across the human phenome. Hemani G, Zheng J, Elsworth B, Wade KH, Haberland ...

eLife

Published - 2018