White Swan UK Oncology Online Patient & Public Conversations Dataset

Population Size

118,984

People

Years

2023

Associated BioSamples

None/not available

Geographic coverage

United Kingdom

Lead time

1-2 months

Summary

The dataset contains anonymised patient and public conversations that took place online about more than 50 cancer types, including both the most common cancers and rarer types.

Documentation

The dataset contains anonymised patient and public conversations that took place online about more than 50 cancer types, including both the most common cancers and rarer types.

The dataset is curated from specific cancer types and cancer patient forums. It does not include every social media post about cancer in the online sources, since such posts are often irrelevant to the patient experience.

Dataset type

Health and disease, Treatments/Interventions, Measurements/Tests, Imaging types, Socioeconomic, Lifestyle

Dataset sub-type

Cancer

Dataset population size

118984

Keywords

Observations

Observed Node

Persons

Disambiguating Description

Persons in this dataset are counted as the number of unique display names in the data. The count is calculated per source (Reddit, reviews, other forums) and then totalled. In the other forums and reviews domains, persons may choose to appear as anonymous. In that case, anonymous users are counted once per domain, for example once for 'https://healthunlocked.com/lungcancer'.

Measured Value

118984

Measured Property

Unique online names indicating number of persons

Observation Date

16 Apr 2025
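
The person-counting rule described for the Persons observation can be illustrated with a minimal Python sketch. This is not White Swan's actual pipeline: the record layout (source, domain, display name), the function name count_persons, the literal "anonymous" marker, and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed marker for users who post without a chosen display name.
ANONYMOUS = "anonymous"


def count_persons(posts):
    """Count persons as unique display names per source, then total them.

    posts: iterable of (source, domain, display_name) tuples (hypothetical layout).
    Anonymous posters are counted once per domain rather than once per post.
    """
    names_per_source = defaultdict(set)
    for source, domain, display_name in posts:
        if display_name.strip().lower() == ANONYMOUS:
            # One anonymous "person" per domain.
            key = (ANONYMOUS, domain)
        else:
            key = display_name
        names_per_source[source].add(key)
    per_source = {source: len(names) for source, names in names_per_source.items()}
    return per_source, sum(per_source.values())


# Hypothetical sample rows; the .invalid domains are placeholders.
posts = [
    ("reddit", "reddit.com", "user_a"),
    ("reddit", "reddit.com", "user_a"),
    ("reddit", "reddit.com", "user_b"),
    ("other forums", "healthunlocked.com/lungcancer", "Anonymous"),
    ("other forums", "healthunlocked.com/lungcancer", "Anonymous"),
    ("other forums", "forum.example.invalid", "Anonymous"),
    ("reviews", "reviews.example.invalid", "user_a"),
]

per_source, total = count_persons(posts)
print(per_source, total)
```

Because the count is taken per source before totalling, the same display name appearing in two sources (user_a above) contributes one person to each source in this sketch.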

Provenance

Purpose of dataset collection

Research cohort

Source of data extraction

Free text NLP

Collection source setting

Other

Image contrast

Not stated

Biological sample availability

None/not available

Structural Metadata

Details

Publishing frequency

Irregular

Version

1.0.0

Modified

07/04/2025

Citation Requirements

White Swan is a registered charity in England and Wales (1176486) improving health and wellbeing through AI technology and analytics.

Coverage

Start date

01/03/2023

Time lag

1-2 months

Geographic coverage

United Kingdom

Maximum age range

112

Follow-up

Other

Accessibility

Language

en

Alignment with standardised data models

OTHER, LOCAL

Controlled vocabulary

LOCAL, OTHER, HPO

Format

csv, xlsx, web page explorer

Data Access Request

Dataset pipeline status

Available

Access rights

In Progress

Time to dataset access

1-2 months

Access request cost

On Request

Access method category

Varies based on project

Access service description

On Request

Jurisdiction

UK

Data use limitation

Project-specific restrictions

Data use requirements

Project-specific restrictions

Data Controller

White Swan

Data Processor

White Swan

