HDR Gateway logo

Genomics England - Secondary Data - Cancer Specific Curated Datasets - Pilot

Population Size

Not reported

Years

2008 - 2022

Associated BioSamples

None/not available

Geographic coverage

Not reported

Lead time

2-6 months

Summary

Secondary data curated by Genomics England, cancer-specific.

Documentation

Genomics England is striving to improve the clinical data provided to its researchers. We understand the value of accurate and granular clinical data, especially in the context of cancer.

To deliver this, we are planning a series of pilot datasets that aim to incorporate additional clinical data provided by the Public Health England cancer registry (NCRAS). Genomics England will aim to deliver cancer-specific datasets, with the initial focus on providing a broad pathological understanding. This will aim to incorporate data points from pathology reports, such as molecular mutations and resection margins. The focus will then extend to radiological imaging reports and, finally, to live, up-to-date clinical data. We are also including the date each participant was last seen alive (data provided up to October 2020) and dates and causes of death, to aid outcomes analysis.

It must be stressed that this work is a development process, and we are working closely with NCRAS to progress it. Whilst we do not possess the extensive experience and resources of Public Health England, we are developing a natural-language-based algorithm for focused data extraction. NCRAS has a team dedicated to curating clinical data, and the NCRAS curated tables remain the gold standard. However, for this dataset to improve, Genomics England is keen to receive feedback and for you to highlight areas for improvement.

You will note subtle differences in the structure of the table compared to the curated NCRAS tables, and additional data dictionaries have therefore been provided. Genomics England hopes to continue developing this uncurated live dataset with feedback and looks forward to hearing your thoughts. Please send thoughts and suggestions via the Genomics England Service Desk, including "cancer_specific_datasets_pilot" in the title of your enquiry.

With the addition of the new pathology_reports dataset introduced in v16, the aml_path_reports and testes_path_reports datasets have been deprecated in v17.

Dataset type
Health and disease
Dataset sub-type
Not applicable

Keywords

genome, DNA, genomics, Public Health England, Cancer

Observations

Observed Node | Disambiguating Description | Measured Value | Measured Property | Observation Date
Persons | Cancer Tumour - Number of genomes | Not reported | Count | 01 Jan 1970
Persons | Rare Disease Participants | Not reported | Count | 01 Jan 1970
Persons | Cancer Germline - Number of genomes | Not reported | Count | 01 Jan 1970
Persons | Cancer Participants | Not reported | Count | 01 Jan 1970
Persons | Rare Disease - Number of genomes | Not reported | Count | 01 Jan 1970

Provenance

Image contrast
Not stated
Biological sample availability
None/not available

Structural Metadata

Details

Publishing frequency
Quarterly
Version
17.0.0
Modified

08/10/2024

Distribution release date

30/03/2023

Citation Requirements
Genomics England

Coverage

Start date

04/09/2008

End date

30/12/2022

Time lag
Other
Maximum age range
152

Accessibility

Language
en
Controlled vocabulary
LOCAL
Format
Multiple Formats Available

Data Access Request

Dataset pipeline status
Not available
Time to dataset access
2-6 months
Access request cost
Fees depend on the type of access required. Raw data is not eligible for export. Summary-level data may be exported provided that it is approved through the Genomics England Airlock Process.
Access service description

More information about the Genomics England Research Environment can be found here:

https://www.genomicsengland.co.uk/about-genomics-england/research-environment/

https://research-help.genomicsengland.co.uk/display/GERE/1.+The+Genomics+England+Research+Environment

Genomics England 100k participants have consented to longitudinal lifetime follow-up and recontact safely through our clinical network. BRST (Bioinformatics Research Services) is a team of bioinformaticians who know the dataset inside out and provide consultancy projects on a case-by-case basis. Our network of clinical and medical experts can be made available on a case-by-case basis. Researchers also have the opportunity to work with and access the GeCIP network, a community of world-leading experts in specific cancers and rare diseases.

Jurisdiction
GB-GBN
Data Controller
PUBLIC HEALTH ENGLAND
Data Processor
GENOMICS ENGLAND
