HDR UK Gateway

ID-468: AI-DIP Artificial Intelligence for Cancer Diagnosis in General Practice

Safe People

Organisation name

Imperial College London

Applicant name(s)

Brendan Delaney

Funders/Sponsors

Safe Projects

Project ID

ID-468-1

Lay summary

This application relates to one work stream of a large NIHR i4i Office of Life Sciences Cancer Early Detection project.

Public benefit statement

MedALBERT is a bespoke pre-trained Transformer model held in the WSIC analysis space. Sequences of Read/SNOMED codes leading up to a lung cancer diagnosis, together with those of control cases, have been pre-processed down to 450 code groups, which are represented in the model as ‘embeddings’. Embeddings encode the relationships between these code groups mathematically. In theory, such models could be subject to hacking that identifies unusual relationships in the embeddings, which poses a theoretical risk to data privacy. As such, they need to be managed in appropriate environments, i.e. environments designed to restrict data access to prevent hacking attempts and subject to information governance approval. There is currently no established process for this, but we would like to use the DARS request, as the MedALBERT model ‘represents’ the WSIC data.
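
As a rough illustration of what representing code groups as embeddings means, the sketch below maps a sequence of clinical codes to code-group ids and looks up one vector per group. It is not the MedALBERT implementation: the code names, the mapping, the embedding width, the random initial values, and the choice to skip unmapped codes are all assumptions made for illustration. Only the figure of 450 code groups comes from the statement above. In a trained model the vectors would be learned from the data, which is why they can carry information about it.

```python
import random

NUM_GROUPS = 450  # number of code groups stated in the text
EMBED_DIM = 8     # assumed embedding width, for illustration only

# Hypothetical mapping from raw clinical codes to code-group ids.
# These are made-up placeholders, not real Read/SNOMED codes.
CODE_TO_GROUP = {"CODE_A": 0, "CODE_B": 1, "CODE_C": 1, "CODE_D": 449}


def build_embedding_table(num_groups, dim, seed=0):
    """Create one vector per code group (random here; learned in a real model)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_groups)]


def encode_sequence(codes, code_to_group):
    """Map raw codes to group ids, skipping codes with no group (an assumption)."""
    return [code_to_group[c] for c in codes if c in code_to_group]


def embed_sequence(group_ids, table):
    """Look up the embedding vector for each group id in the sequence."""
    return [table[g] for g in group_ids]


table = build_embedding_table(NUM_GROUPS, EMBED_DIM)
patient_codes = ["CODE_A", "CODE_C", "CODE_X", "CODE_D"]  # CODE_X is unmapped
group_ids = encode_sequence(patient_codes, CODE_TO_GROUP)
vectors = embed_sequence(group_ids, table)
```

Here the four input codes reduce to three group ids, each of which is replaced by its vector before being passed to the model.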

Other approval committees

Project start date

28/07/2025

Project end date

28/01/2026

Latest approval date

25/09/2025

Safe Data

Dataset(s) name

Data sensitivity level

De-Personalised

Release/Access date

20/10/2025

Safe Setting

Access type

TRE

Safe Outputs

Link to research outputs