Creating Tools to Identify Common and Rare Diseases in Complex Health Data

Safe People

Organisation name

University College London

Organisation sector

Academic Institute

Applicant name(s)

Spiros Denaxas

Funders/Sponsors

Safe Projects

Project ID

OFHS240157

Lay summary

When you visit a hospital or your general practitioner (GP), details about your health, such as medical test results, medications, symptoms and any conditions you are diagnosed with, are saved on a computer system in electronic form, known as an electronic health record. These electronic records contain a lot of useful information that researchers can study, in a safe and secure way, to answer questions that can help improve healthcare and treatments for patients. However, health data can be tricky to work with. Different doctors and hospitals may record the same condition, like a diagnosis of asthma, using different methods. This makes it hard for researchers to work with the data consistently. To solve this problem, we will create tools to make sure that people with a disease are identified in the data in the same way and are not labelled as having the disease when they do not. Researchers will also be able to use the tools to help distinguish people with different types of the same disease, such as different types of diabetes. This will make research much more efficient.
When individuals are admitted to hospital, or have a consultation with their GP, information about their diagnoses, test results, medications and symptoms is recorded electronically in a computer system, in what are called electronic health records. These records contain a wealth of information that researchers can responsibly use to answer important scientific questions that can improve human health and healthcare. Health data are, however, very complex because they are recorded by different healthcare professionals in different settings in different ways. For example, your GP will record a diagnosis of hypertension in the computer system using a different set of codes than a hospital doctor would use. It is therefore important to create tools that enable researchers to accurately define diseases in such complex data so the information can be used for research.
Highly accurate disease definitions correctly identify all individuals who have the disease and do not incorrectly identify individuals without the disease as having it. Defining diseases and different types of the same disease can enable researchers to examine how the same disease differs across different patients.

Public benefit statement

Enabling researchers to use accurate definitions of human diseases is valuable because it will directly improve the quality of the research findings they produce. The tools and information from this study can be used to enhance and support other research that can improve human health and healthcare, for example, by accurately identifying which people will benefit from a particular drug or by helping create better drugs for patients who do not currently benefit from existing treatments. Our methods are reusable and will help accelerate research in studies with similar structure and data sources.

Request category type

Public Health Research

Other approval committees

Project start date

08/05/2025

Latest approval date

08/05/2025

Safe Data

Dataset(s) name

Safe Setting

Access type

TRE

Safe Outputs

Link to research outputs
