HDR Gateway logo
OPTIMAM Mammographic Image Database

Population Size

540,000

People

Years

2011

Associated BioSamples

None/not available

Geographic coverage

United Kingdom

Lead time

1-2 months

Summary

The OPTIMAM Mammography Image Database is a sharable resource with processed and unprocessed mammography images from United Kingdom breast screening centers, with annotated cancers and clinical details.

Documentation

The development of artificial intelligence software to improve the outcomes of breast screening relies on the availability of well-curated image databases. The OPTIMAM Mammography Image Database (OMI-DB) was created to provide a centralized, fully annotated dataset for research. It was initially built for the Cancer Research UK–funded projects OPTIMAM (2008–2013) and OPTIMAM2 (2013–2018), which evaluated how various factors affect breast cancer detection on mammograms. The images are derived from screening centers in the United Kingdom and combined with systematically collected data on the current screening episode, as well as previous and subsequent episodes. In the United Kingdom, the National Health Service Breast Screening Programme (NHSBSP) invites women to attend breast screening every 3 years between the ages of 50 and 70 years. A screening episode is one attendance at screening by a woman and includes any immediate workup imaging (assessment) if she was recalled for further investigation of a suspicious region on the screening mammograms. Any pathologic finding is also included, and the episode ends with histologic diagnosis or treatment for all lesions. Our objective was to collect mammograms for women with screen-detected cancers as well as representative samples of normal and benign screening cases.

“For processing” and “for presentation” screening mammograms and prior mammograms have been collected for all screen-detected and interval cancers from several screening centres since 2011. All mammography images and data associated with initial screening attendance, further assessment, and surgical outcomes were collected as a screening episode. In addition to continuous collection of cancers, images and clinical data were collected for all women screened during 2014, and for a random selection of 25% of all women screened in 2012, 2013, and 2015 at two of the three sites. Collection into the database is ongoing, and each case is updated with new information and further screening episodes.
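The collection rule described above can be read as a simple per-case decision. The following is a minimal sketch under stated assumptions, not the project's actual selection code: the function name `is_collected` and the constants are hypothetical, and the sketch ignores the fact that the 25% sampling applied at only two of the three sites.

```python
import random

# Years taken from the description above; constant names are hypothetical.
CANCER_COLLECTION_START = 2011    # cancers collected continuously since 2011
FULL_COLLECTION_YEARS = {2014}    # all screened women collected
SAMPLED_YEARS = {2012, 2013, 2015}  # random 25% of screened women collected
SAMPLE_FRACTION = 0.25


def is_collected(year: int, is_cancer: bool, rng: random.Random) -> bool:
    """Return True if a case would be collected under this simplified rule."""
    # Screen-detected and interval cancers are always collected from 2011 on.
    if is_cancer and year >= CANCER_COLLECTION_START:
        return True
    # Every woman screened in a full-collection year is included.
    if year in FULL_COLLECTION_YEARS:
        return True
    # In sampled years, a non-cancer case is included with probability 0.25.
    if year in SAMPLED_YEARS:
        return rng.random() < SAMPLE_FRACTION
    return False
```

Passing a seeded `random.Random` instance keeps the sampling reproducible, which is useful when checking that roughly a quarter of non-cancer cases from a sampled year are retained.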

The associated data comprise radiologic, clinical, and pathologic information extracted from the National Breast Screening System (NBSS). Information on screening history, previous occurrences of cancer, biopsy results, and surgical procedures is collected from NBSS. The exact radiologic locations of lesions are not stored in NBSS; however, this information, which is important for training and evaluating algorithms, is collected in OMI-DB. Experienced (UK-accredited) mammography readers (radiologists and advanced practice radiographers) annotate the images at their own site, with reference to records made at the time of initial mammography interpretation and at further (assessment) workup (magnification views, ultrasound, and biopsy). These annotations define rectangular regions of interest indicating the location and area of lesions, together with other attributes such as radiologic appearance and conspicuity.
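To illustrate what a rectangular region-of-interest annotation might look like in code, here is a minimal sketch. The class `LesionROI` and all of its field names are hypothetical and are not taken from the OMI-DB schema; pixel coordinates are assumed to be integers with an exclusive upper bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LesionROI:
    """Hypothetical rectangular lesion annotation in pixel coordinates."""
    x_min: int
    y_min: int
    x_max: int  # exclusive upper bound (an assumption of this sketch)
    y_max: int  # exclusive upper bound (an assumption of this sketch)
    appearance: str   # radiologic appearance; free text here
    conspicuity: str  # free text here

    def area(self) -> int:
        """Area of the rectangle in pixels (zero if the box is degenerate)."""
        width = max(0, self.x_max - self.x_min)
        height = max(0, self.y_max - self.y_min)
        return width * height

    def contains(self, x: int, y: int) -> bool:
        """True if pixel (x, y) lies inside the rectangle."""
        return self.x_min <= x < self.x_max and self.y_min <= y < self.y_max


# Example values are invented for illustration only.
roi = LesionROI(100, 200, 300, 500, appearance="mass", conspicuity="obvious")
```

A structure like this is enough to check, for instance, whether a point flagged by a detection algorithm falls inside an annotated lesion.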

Dataset type
Health and disease
Dataset sub-type
Not applicable
Dataset population size
540000

Keywords

Mammography, Tomosynthesis, Breast Screening, Symptomatic, Breast Cancer

Observations

| Observed Node | Disambiguating Description | Measured Value | Measured Property | Observation Date |
| --- | --- | --- | --- | --- |
| Persons | Total number of clients in the dataset | 540000 | Total count | 26 May 2023 |
| Findings | Screen-detected cancer findings | 18000 | Count of screen-detected cancer findings | 26 May 2023 |
| Events | Interval cancer events | 3500 | Number of interval cancer events | 26 May 2023 |

Provenance

Purpose of dataset collection
Disease registry
Source of data extraction
Other
Collection source setting
Other
Patient pathway description
The images are derived from screening centres in the United Kingdom and combined with systematically collected data on the current screening episode, as well as previous and subsequent episodes. In the United Kingdom, the National Health Service Breast Screening Programme (NHSBSP) invites women to attend breast screening every 3 years between the ages of 50 and 70 years. A screening episode is one attendance at screening by a woman and includes any immediate workup imaging (assessment) if she was recalled for further investigation of a suspicious region on the screening mammograms. Any pathologic finding is also included, and the episode ends with histologic diagnosis or treatment for all lesions. At some screening centers younger and older women are also invited for screening as part of the national age trial. Some women in high-risk groups receive annual invitations to screening. Our objective was to collect mammograms for women with screen-detected cancers as well as representative samples of normal and benign screening cases.
Image contrast
Not stated
Biological sample availability
None/not available

Structural Metadata

Details

Publishing frequency
Continuous
Version
2.0.0
Modified

08/10/2024

Citation Requirements
Cancer Research UK; Royal Surrey NHS Foundation Trust

Coverage

Start date

01/01/2011

Time lag
Variable
Geographic coverage
United Kingdom
Minimum age range
50
Maximum age range
70
Follow-up
Continuous

Accessibility

Language
en
Controlled vocabulary
LOCAL
Format
DICOM, JSON

Data Access Request

Dataset pipeline status
Not available
Time to dataset access
1-2 months
Jurisdiction
GB-ENG
Data use limitation
Project-specific restrictions
Data use requirements
Project-specific restrictions
Data Controller
CRUK and Royal Surrey NHS Foundation Trust are joint data controllers. The OPTIMAM Data Access Committee (https://medphys.royalsurrey.nhs.uk/omidb/the-steering-committee/) administers and reviews data access requests. Please apply for access at https://medphys.royalsurrey.nhs.uk/omidb/apply-for-access/.
Data Processor
Data scientists from the Royal Surrey manage the collection, storage, and distribution of the dataset.
