Environmental determinants of health; linked health and DEFRA air quality data
Population Size
1,914,540
People
Years
2020 - 2023
Associated BioSamples
None/not available
Geographic coverage
United Kingdom
England
Lead time
1-2 months
Summary
Documentation
A highly granular, large dataset of 10,908,440 admissions, curated by PIONEER to match DEFRA’s air pollution data to patients’ registered addresses. The data include demography, admission details, diagnostic codes (ICD-10 & SNOMED-CT), respiratory data, medications and presenting complaints, all linked to DEFRA air quality data. This dataset is a resource for researchers seeking to understand the short- and long-term impacts of air quality on health outcomes. It combines DEFRA air pollution data with anonymised health records, offering an opportunity for multidisciplinary research in environmental health, epidemiology and beyond. The current dataset includes admissions from 01-01-2000 to 31-08-2023 but can be expanded to assess other timelines of interest.
Geography: The West Midlands (WM) has a population of 6 million & includes a diverse ethnic & socio-economic mix. UHB is one of the largest NHS Trusts in England, providing direct acute services & specialist care across four hospital sites, with 2.2 million patient episodes per year, 2,750 beds & an ITU capacity of more than 120 beds. UHB runs a fully electronic healthcare record (EHR) (PICS; Birmingham Systems), a shared primary & secondary care record (Your Care Connected) & a patient portal, “My Health”.
DEFRA
Air quality data has been extracted from sampling stations in the Birmingham area and comprises hourly readings of volatile and non-volatile particulates, hydrocarbons, sulphur dioxide, ozone, carbon monoxide and nitrogen oxides. Each station covers only a subset of these pollutants, so the full set may not be available for every station. © Crown 2024 copyright Defra via uk-air.defra.gov.uk, licensed under the Open Government Licence (OGL).
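As an illustration only, the sketch below shows one way hourly station readings could be summarised to daily means and attached to an admission. The field names (`station`, `timestamp`, `pollutant`, `value`, `admission_date`), the station identifier and the sample values are hypothetical and do not reflect the actual schema of this dataset. It also assumes each admission has already been assigned a station. Pollutants that a station does not measure come back as `None`, reflecting the per-station coverage caveat above.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_means(readings):
    """Average hourly readings per (station, date, pollutant), skipping gaps."""
    buckets = defaultdict(list)
    for r in readings:
        if r["value"] is None:  # missing hourly reading
            continue
        day = datetime.fromisoformat(r["timestamp"]).date()
        buckets[(r["station"], day, r["pollutant"])].append(r["value"])
    return {key: mean(values) for key, values in buckets.items()}


def exposure_for_admission(admission, means, pollutants):
    """Look up same-day means for the admission's assigned station.

    Returns None for any pollutant the station did not record that day.
    """
    day = datetime.fromisoformat(admission["admission_date"]).date()
    return {p: means.get((admission["station"], day, p)) for p in pollutants}


# Hypothetical sample data, not taken from the dataset.
readings = [
    {"station": "A", "timestamp": "2023-01-01T00:00:00", "pollutant": "ozone", "value": 40.0},
    {"station": "A", "timestamp": "2023-01-01T01:00:00", "pollutant": "ozone", "value": 44.0},
    {"station": "A", "timestamp": "2023-01-01T02:00:00", "pollutant": "ozone", "value": None},
]
admission = {"admission_date": "2023-01-01", "station": "A"}

means = daily_means(readings)
exposure = exposure_for_admission(admission, means, ["ozone", "carbon_monoxide"])
print(exposure)
```

A same-day mean is only one possible exposure window; lagged or multi-day windows would need a different grouping key.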
Data set availability: Data access is available via the PIONEER Hub for projects which will benefit the public or patients. This can be by developing a new understanding of disease, by providing insights into how to improve care, or by developing new models, tools, treatments, or care processes. Data access can be provided to NHS, academic, commercial, policy and third sector organisations. Applications from SMEs are welcome. There is a single data access process, with public oversight provided by our public review committee, the Data Trust Committee. Contact pioneer@uhb.nhs.uk or visit www.pioneerdatahub.co.uk for more details.
Available supplementary data: Matched controls; ambulance and community data. Unstructured data (images). We can build synthetic data to meet bespoke requirements.
Available supplementary support: Analytics, model build, validation & refinement; A.I. support. Data partner support for ETL (extract, transform & load) processes. Bespoke and “off the shelf” Trusted Research Environment (TRE) build and run. Consultancy with clinical, patient & end-user and purchaser access/support. Support for regulatory requirements. Cohort discovery. Data-driven trials and “fast screen” services to assess population size.
Dataset type
Dataset sub-type
Dataset population size
Keywords
Observations
Observed Node | Disambiguating Description | Measured Value | Measured Property | Observation Date |
---|---|---|---|---|
Persons | 1914540 patients from Jan 2000 to Sep 2023 | 1914540 | Count | 02 Jan 2024 |
Provenance
Purpose of dataset collection
Source of data extraction
Collection source setting
Patient pathway description
Image contrast
Biological sample availability
Structural Metadata
Details
Publishing frequency
Version
Modified
08/10/2024
Distribution release date
19/01/2024
Citation Requirements
Coverage
Start date
01/01/2020
End date
31/08/2023
Time lag
Geographic coverage
Maximum age range
Follow-up
Accessibility
Language
Alignment with standardised data models
Controlled vocabulary
Format
Data Access Request
Dataset pipeline status
Time to dataset access
Access request cost
Access method category
Access service description
Trusted Research Environments (TREs) are built using Microsoft Azure services and hosted in the UK to provide research teams with a safe, secure and agile environment in which users can quickly analyse, interpret and form an enriched view of primary care information through a range of integrated datasets.
Health data collated from multiple sources is ingested into a secure data lake, from which subsets of data can be made available to research teams on approval of a data request. Once a request is approved, a customer-specific TRE is made available with a standard set of analytical tools from Microsoft, including Azure Databricks, Azure Machine Learning, Azure SQL and Azure Synapse (for large-scale data warehouses). Specific tools can be provided at an additional cost over the standard platform data access charge, and the PIONEER team will work with you to determine your exact needs.
Access to the TRE is managed using virtual desktop technology to provide a safe and secure end-user experience. This design allows PIONEER to create TREs rapidly to meet customer requirements.
Jurisdiction
Data use limitation
Data use requirements
Data Controller