Genomics England - Secondary Data - Cancer Specific Curated Datasets - Pilot
Population Size
Years
2008 - 2022
Associated BioSamples
None/not available
Geographic coverage
Lead time
2-6 months
Summary
Documentation
Genomics England is striving to improve the clinical data provided to its researchers. We understand the value of accurate and granular clinical data, especially in the context of cancer.
To deliver this, we are planning a series of pilot datasets that incorporate additional clinical data provided by the Public Health England cancer registry (NCRAS). Genomics England will aim to deliver cancer-specific datasets, with the initial focus on providing a broad pathological understanding by incorporating data points from pathology reports, such as molecular mutations and resection margins. The focus will then extend to radiological imaging reports and, finally, to live, up-to-date clinical data. In addition, we are including the date each participant was last seen alive (data provided up to October 2020) and dates and causes of death to support outcomes analysis.
It must be stressed that this work is a development process, and we are working closely with NCRAS to progress it. Whilst we do not possess the extensive experience and resources of Public Health England, we are developing a natural-language-based algorithm for focused data extraction. NCRAS has a team dedicated to curating clinical data, and the NCRAS curated tables remain the gold standard. However, for this dataset to improve, Genomics England is keen to receive feedback and for you to highlight areas for improvement.
You will note subtle differences in the structure of the table compared with the curated NCRAS tables; additional data dictionaries have therefore been provided. Genomics England hopes to continue developing this uncurated live dataset with your feedback and looks forward to hearing your thoughts. Please send thoughts and suggestions via the Genomics England Service Desk, including "cancer_specific_datasets_pilot" in the title of your enquiry.
With the addition of the new pathology_reports dataset introduced in v16, the aml_path_reports and testes_path_reports datasets have been deprecated in v17.
Keywords
Observations
Observed Node | Disambiguating Description | Measured Value | Measured Property | Observation Date |
---|---|---|---|---|
Persons | Cancer Tumour - Number of genomes | Not reported | Count | 01 Jan 1970 |
Persons | Rare Disease Participants | Not reported | Count | 01 Jan 1970 |
Persons | Cancer Germline - Number of genomes | Not reported | Count | 01 Jan 1970 |
Persons | Cancer Participants | Not reported | Count | 01 Jan 1970 |
Persons | Rare Disease - Number of genomes | Not reported | Count | 01 Jan 1970 |
Provenance
Structural Metadata
Details
08/10/2024
30/03/2023
Coverage
04/09/2008
30/12/2022
Accessibility
Data Access Request
More information about the Genomics England Research Environment can be found here:
https://www.genomicsengland.co.uk/about-genomics-england/research-environment/
Genomics England 100k participants have consented to longitudinal lifetime follow-up and to being safely recontacted through our clinical network. BRST (Bioinformatics Research Services) is a team of bioinformaticians who know the dataset inside out and provide consultancy projects on a case-by-case basis. Our network of clinical and medical experts can be made available on a case-by-case basis. Researchers also have the opportunity to work with and access the GeCIP network, a community of world-leading experts in specific cancers and rare diseases.