ID-468: AI-DIP Artificial Intelligence for Cancer Diagnosis in General Practice
Safe People
Organisation name
Imperial College London
Applicant name(s)
Brendan Delaney
Funders/Sponsors
Safe Projects
Project ID
ID-468-1
Lay summary
This application relates to one work stream of a large NIHR i4i Office of Life Sciences Cancer Early Detection project.
Public benefit statement
MedALBERT is a bespoke pre-trained Transformer model in the WSIC analysis space. Sequences of READ/SNOMED codes leading up to a lung cancer diagnosis, together with those of control cases, have been pre-processed to reduce them to 450 code groups, which are represented in the model as ‘embeddings’. Embeddings are mathematical relations between the data structures. In theory, these models could be subject to hacking that identifies unusual relationships in the embeddings, which poses a theoretical risk to data privacy. They therefore need to be managed in appropriate environments, i.e. environments designed to restrict data access to avoid hacking attempts, and to be subject to information governance approval. There is currently no process for this, but we would like to use the DARS request because the MedALBERT model ‘represents’ the WSIC data.
Other approval committees
Project start date
28/07/2025
Project end date
28/01/2026
Latest approval date
25/09/2025
Safe Data
Dataset(s) name
Data sensitivity level
De-Personalised
Release/Access date
20/10/2025
Safe Setting
Access type
TRE