HDR UK Gateway

Leeds Teaching Hospitals OMOP Database

Population Size

1,500,000

People

Years

2003 - 2025

Associated BioSamples

None/not available

Geographic coverage

E08000035

Lead time

2-6 months

Summary

The Leeds Teaching Hospitals NHS Trust OMOP database offers a comprehensive, research-ready dataset derived from the electronic health records of patients diagnosed and treated at Leeds Teaching Hospitals NHS Trust since 2003. Structured in the OMOP Common Data Model, it encompasses extensive demographic and clinical data, and includes a wealth of cancer-specific information, allowing multi-centre epidemiological and outcomes research through secure, federated analytics and OHDSI tools.

Documentation

The Leeds Teaching Hospitals NHS Trust (LTHT) OMOP database is a robust, longitudinal dataset constructed using data from the electronic health records (EHR) of patients treated and diagnosed at Leeds Teaching Hospitals NHS Trust since 2003. This comprehensive resource is mapped to the OMOP CDM, ensuring interoperability with other OMOP databases, and enabling privacy-preserving, large-scale, multi-centre studies.

Encompassing a wide array of clinical data, the database includes information on demographics, diagnoses, procedures, medications and laboratory results. A particular strength lies in its detailed cancer-specific data, which supports in-depth analyses of treatment outcomes, survival rates, and disease progression. This makes it an invaluable resource for researchers focusing on oncology, as well as those interested in broader secondary care settings.

Researchers can draw insights from the LTHT OMOP database through federated analytics approaches as well as through the use of standardised OHDSI tools, which enable secure, privacy-preserving analyses across multiple institutions, eliminating the need to access individual-level patient data.
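The federated pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not the LTHT, Vantage6, or OHDSI implementation: the `SiteSummary` class, the `local_summary` and `pooled_proportion` functions, the row layout, and the minimum cell count of 5 are all assumptions made for the example. The point it shows is that each site reduces its own rows to aggregate counts, and only those counts are combined centrally.

```python
from dataclasses import dataclass

MIN_CELL_COUNT = 5  # assumed suppression threshold, for illustration only


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate-only result that a site is willing to share."""
    site: str
    n_patients: int
    n_with_event: int


def local_summary(site, rows, min_cell=MIN_CELL_COUNT):
    """Runs inside a site: reduce individual-level rows to counts."""
    patients = {r["person_id"] for r in rows}
    with_event = {r["person_id"] for r in rows if r["had_event"]}
    if len(patients) < min_cell:
        # Refuse to release very small cohorts (hypothetical rule).
        raise ValueError(f"{site}: cohort below minimum cell count")
    return SiteSummary(site, len(patients), len(with_event))


def pooled_proportion(summaries):
    """Runs at the coordinator: sees only the shared counts."""
    total = sum(s.n_patients for s in summaries)
    events = sum(s.n_with_event for s in summaries)
    return events / total


# Made-up rows standing in for two sites' local data.
site_a_rows = [{"person_id": i, "had_event": i < 3} for i in range(6)]
site_b_rows = [{"person_id": i, "had_event": i < 2} for i in range(10)]

summaries = [
    local_summary("site_a", site_a_rows),
    local_summary("site_b", site_b_rows),
]
print(pooled_proportion(summaries))  # 0.3125
```

In a real study the local step would run within each institution's environment, and the disclosure rules would be set by that institution's governance rather than by a constant in the analysis code.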

Notably, the LTHT OMOP database has been instrumental in several high-profile studies:

• HERON Network: LTHT is a member of the HERON network, funded by HDR UK, which focuses on enhancing the quality and impact of cancer research through federated analytics. LTHT participated in a study examining the use of antibiotics which are in the WHO watchlist for high risk of antimicrobial resistance.

• DigiONE Pilot Studies: These studies analyse harmonised routine care data from OMOP databases in 6 digitally mature European hospitals. Three studies have been conducted to date, focusing on the impact of the COVID-19 pandemic on cancer care, on metastatic non-small cell lung cancer, and on HER2-/HR+ metastatic breast cancer.

• FALCON-Lung Study: This study focused on the uptake of immune checkpoint inhibitors for metastatic non-small cell lung cancer across the world, and implemented a clinically validated line of therapy algorithm using systemic anti-cancer therapy data in the OMOP databases of 17 international institutions.

In summary, the LTHT OMOP database stands as a robust resource for secondary care research, particularly in oncology. Its comprehensive, high-quality data, combined with a commitment to national and international collaboration, positions it as a cornerstone for advancing healthcare research and improving patient outcomes.

The LTHT OMOP database consists of the following tables and data:

• Visit occurrence: includes inpatient and outpatient admissions for all patients who are or have been on the cancer pathway, as well as all inpatient admissions for all other patients. The visit_detail table has not been populated.

• Condition occurrence: populated with all diagnoses in the Trust since 2003.

• Drug exposure: populated. Includes all anti-cancer drugs (chemotherapy and immunotherapy) and selected antibiotics (all antibiotics that are in the WHO watchlist for antimicrobial resistance, as well as access antibiotics). There are plans to extend this to all prescribed medication.

• Procedure occurrence: populated. Includes surgical and radiotherapy procedures delivered to patients with cancer, as well as all surgical procedures delivered to all other patients.

• Measurement: populated with weight, height, TNM staging, performance status, and metastasis location data.

• Observation: populated with ethnicity, IMD quintile, clinical trial participation (cancer only) and cancer histology data.

• Device exposure: not populated.

• Death: populated from ONS.
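When scoping a study, it can help to check a protocol's required tables against the population status listed above. The sketch below transcribes that status into a dictionary; the `TABLE_STATUS` name and the `missing_tables` helper are hypothetical and not part of any OHDSI tool.

```python
# Population status transcribed from the table list above.
# True means "populated", not "complete for every patient group".
TABLE_STATUS = {
    "visit_occurrence": True,
    "visit_detail": False,
    "condition_occurrence": True,
    "drug_exposure": True,
    "procedure_occurrence": True,
    "measurement": True,
    "observation": True,
    "device_exposure": False,
    "death": True,
}


def missing_tables(required, status=TABLE_STATUS):
    """Return the required tables that are unpopulated or not listed."""
    return sorted(t for t in required if not status.get(t, False))


needed = ["drug_exposure", "device_exposure", "visit_detail"]
print(missing_tables(needed))  # ['device_exposure', 'visit_detail']
```

A populated table can still be partial: drug_exposure, for example, currently covers anti-cancer drugs and selected antibiotics only, so a check like this is a first filter rather than a feasibility assessment.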

Dataset type

Health and disease, Treatments/Interventions, Measurements/Tests, Socioeconomic

Dataset sub-type

Cancer, Cardiovascular, Rare diseases, Metabolic and endocrine, Respiratory, Musculoskeletal, Renal and urogenital, Pathology, Ethnicity, Deprivation, Births and deaths

Dataset population size

1500000

Keywords

Observations

| Observed Node | Disambiguating Description | Measured Value | Measured Property | Observation Date |
| --- | --- | --- | --- | --- |
| Persons | Given total of distinct PERSON_ID in the OMOP PERSON table | 1500000 | Count | 07 May 2025 |
| Events | Total count of diagnosis | 8000000 | Count | 06 May 2025 |
| Events | Number of visits (in-patient/out-patient) | 13000000 | Count | 06 May 2025 |
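As a rough sense of scale, the counts above can be turned into crude per-person averages. The sketch below uses only the three published figures; the variable and function names are illustrative. The figures appear to be rounded and were observed a day apart, and visit coverage differs between cancer and non-cancer patients, so the results are indicative only.

```python
# Counts as published in the Observations section above.
PERSONS = 1_500_000
DIAGNOSES = 8_000_000
VISITS = 13_000_000


def per_person(total, persons=PERSONS):
    """Crude average number of records per distinct person."""
    return total / persons


print(f"Diagnoses per person: {per_person(DIAGNOSES):.1f}")  # 5.3
print(f"Visits per person: {per_person(VISITS):.1f}")  # 8.7
```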

Provenance

Purpose of dataset collection

Care

Source of data extraction

EPR

Collection source setting

Clinic, Primary care - Clinic, Secondary care - Accident and Emergency, Secondary care - Outpatients, Secondary care - In-patients

Patient pathway description

The LTHT OMOP database encompasses the full spectrum of patient pathways within secondary care at Leeds Teaching Hospitals NHS Trust. This includes comprehensive data on all patients treated and diagnosed across various specialties, with a particular emphasis on oncology. The cancer-specific data is extensive, covering detailed information on cancer diagnoses, staging, morphology, biomarkers, treatments, and outcomes, thereby supporting detailed analyses of disease natural history, disease progression and treatment efficacy.

In addition to cancer care, the database also captures a wide range of non-cancer conditions, providing a holistic view of patient journeys through secondary care. This dual focus enables researchers to explore comparative effectiveness, identify care gaps, and develop predictive models that can inform both cancer and non-cancer care strategies.

Image contrast

Not stated

Biological sample availability

None/not available

Structural Metadata

Details

Publishing frequency

Bimonthly

Version

1.0.0

Modified

15/05/2025

Distribution release date

07/05/2025

Citation Requirements

Leeds Teaching Hospitals NHS Trust (LTHT)

Coverage

Start date

01/01/2003

End date

07/05/2025

Time lag

Variable

Geographic coverage

E08000035

Maximum age range

150

Follow-up

Continuous

Accessibility

Language

en

Alignment with standardised data models

OMOP

Controlled vocabulary

OPCS4, SNOMED CT, ICD10, ICDO3, LOCAL, DM+D

Format

sql, csv

Data Access Request

Dataset pipeline status

Available

Time to dataset access

2-6 months

Access request cost

Varies upon data request

Access method category

Varies based on project

Access service description

Access to the LTHT OMOP database is governed by strict data governance protocols to ensure patient privacy and compliance with ethical standards. Individual-level patient data is not accessible to external researchers; instead, analyses are conducted through federated platforms, such as Vantage6, and/or using standardised OHDSI tools, which allow for secure, privacy-preserving research. Researchers interested in drawing insights from the LTHT OMOP database without accessing individual-level patient data can initiate the process by contacting the LTHT research data and informatics team (R-DIT) to discuss their project goals and requirements.

Jurisdiction

UK

Data use limitation

General research use, Commercial research use, No linkage

Data use requirements

Collaboration required, Institution-specific restrictions, Project-specific restrictions, User-specific restriction

Data Controller

Leeds Teaching Hospitals NHS Trust (LTHT)

Data Processor

Leeds Teaching Hospitals NHS Trust (LTHT)

Publications about this dataset

OMOP for oncology data: a single-centre and network perspective – OHDSI Europe Symposium 2023

Stelios Theophanous, Kieran Zucker, Louise Hick, E...

OHDSI Europe symposium

Published - 2023