Using Cohort Discovery
1. What is Cohort Discovery?
Cohort Discovery is a service on the Health Data Research (HDR) Gateway that lets approved users rapidly determine the size of the population cohort matching their research question within different datasets from across the UK, without having to contact each organisation that holds the data directly. It combines a natural-language query assistant with a step-by-step visual query builder, making it easier to scope a cohort before submitting a formal data access request.
Users can specify defined characteristics relevant to their proposed analysis (e.g. the number of female asthmatics under the age of 35) through the Cohort Discovery user interface. These search terms are then sent as a real-time query to multiple pseudonymised datasets across multiple Data Custodians, with results returned in the form of a numerical count of individuals that meet those specific criteria. Researchers can then understand whether a dataset contains a cohort (or group) of interest and if so, contact the Data Custodian to find out more or submit a Data Access Request.
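The flow described above can be pictured as one question sent to several datasets, with only a count coming back from each. The following Python sketch is purely illustrative: the `Person` record, the dataset names, the example data, and `run_count_query` are all hypothetical and are not part of the Cohort Discovery service or any Gateway API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class Person:
    """Hypothetical pseudonymised record; real datasets will differ."""
    sex: str
    age: int
    conditions: FrozenSet[str]


def run_count_query(
    datasets: Dict[str, List[Person]],
    criteria: Callable[[Person], bool],
) -> Dict[str, int]:
    """Return one aggregate count per dataset; no individual records are returned."""
    return {
        name: sum(1 for person in records if criteria(person))
        for name, records in datasets.items()
    }


def female_asthmatic_under_35(person: Person) -> bool:
    """The example criteria from the text: female asthmatics under the age of 35."""
    return (
        person.sex == "female"
        and person.age < 35
        and "asthma" in person.conditions
    )


# Made-up example data held by two hypothetical Data Custodians.
example_datasets = {
    "custodian_a": [
        Person("female", 29, frozenset({"asthma"})),
        Person("male", 31, frozenset({"asthma"})),
        Person("female", 52, frozenset({"asthma"})),
    ],
    "custodian_b": [
        Person("female", 34, frozenset({"asthma", "eczema"})),
        Person("female", 22, frozenset({"asthma"})),
    ],
}

counts = run_count_query(example_datasets, female_asthmatic_under_35)
print(counts)
```

The point of the sketch is that the result is a number per dataset, which is enough to judge feasibility without exposing any individual-level data.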
2. Registering for Cohort Discovery Service
Before requesting access to Cohort Discovery, please ensure you have a valid Health Data Research Gateway account. Select “Sign In” on the Health Data Research Gateway to create an account or sign in.
Access request
To request access to Cohort Discovery once you have a valid Gateway account, go to https://healthdatagateway.org/en/about/cohort-discovery, click ‘Access Cohort Discovery’ at the top right of the page, and complete the access request form. The Cohort Discovery Support team will provision your account and confirm when access is ready.
IMPORTANT: If your access has recently been approved, you might need to sign out and sign back in to your Gateway account to refresh your permissions before continuing.
3. Access Cohort Discovery
- Sign in to the Health Data Research Gateway.
- On the Gateway home page, select the Cohort Discovery tile.
- On the About Cohort Discovery page, click on the blue ‘Access Cohort Discovery’ button to open the service. On the next popup, click on the green ‘Access Cohort Service (Beta)’ button to access the new site.
- You will enter the Cohort Discovery workspace, where you can start a new query or review previous results.

4. Build your first query
The New Query workspace is the starting point for all Cohort Discovery searches. You can use natural-language input to get started quickly, or build a structured query using the ‘Insert’ tools in the left panel.
4.1 Open the New Query workspace
- Select the New Query tab at the top of the Cohort Discovery screen.
- Enter a name for your query in the Query Name field.
- In the natural-language search box, type a brief description of the cohort you are looking for, for example: adults with diabetes and metformin.
- Cohort Discovery will interpret your input and generate a starting set of structured rules in the query canvas below.
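To give a feel for what "structured rules" could mean for the example input, here is a minimal Python sketch. It is an assumption-laden illustration, not the service's actual rule format: the `Rule` class, the field names, the operators, and the reading of "adults" as age 18 or over are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Rule:
    """Hypothetical structured rule; not the real Cohort Discovery format."""
    field: str
    operator: str  # this sketch supports only ">=" and "includes"
    value: Any


# One possible reading of "adults with diabetes and metformin".
# Treating "adults" as age >= 18 is an assumption made for this sketch.
example_rules: List[Rule] = [
    Rule("age", ">=", 18),
    Rule("conditions", "includes", "diabetes"),
    Rule("medications", "includes", "metformin"),
]


def matches(record: Dict[str, Any], rule: Rule) -> bool:
    """Check a single rule against a single (made-up) record."""
    actual = record[rule.field]
    if rule.operator == ">=":
        return actual >= rule.value
    if rule.operator == "includes":
        return rule.value in actual
    raise ValueError(f"Unsupported operator: {rule.operator}")


def matches_all(record: Dict[str, Any], rules: List[Rule]) -> bool:
    """Rules are combined with AND: every rule must hold."""
    return all(matches(record, rule) for rule in rules)


# Made-up records used only to exercise the rules.
adult_on_metformin = {
    "age": 54,
    "conditions": {"diabetes"},
    "medications": {"metformin"},
}
child_with_diabetes = {
    "age": 12,
    "conditions": {"diabetes"},
    "medications": {"insulin"},
}

print(matches_all(adult_on_metformin, example_rules))
print(matches_all(child_with_diabetes, example_rules))
```

Generated rules are only a starting point, so it is worth checking each one in the query canvas against what you actually meant.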

4.2 Filter by collections
By default, your query will run across all available collections you have access to. To target specific datasets, use the Filter Collections panel on the right side of the New Query screen.
- Select the Filter Collections button at the top right of the query workspace.
- Check the collections you want to include.
- You can choose to include or exclude Synthetic data collections using the ‘Synthetic Data Collections’ toggle. Synthetic data collections may be useful for testing and feature exploration.
What are collections?
A collection is a dataset that has been onboarded to Cohort Discovery by a Data Custodian. Each collection is linked to a registered Gateway dataset and is made searchable once it has passed through the activation workflow.
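The filtering behaviour described in this section can be sketched in a few lines of Python. This is a hypothetical model only: the `Collection` class, the collection names, and `select_collections` are invented for illustration, and the default state of the synthetic-data toggle is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Collection:
    """Hypothetical stand-in for an onboarded collection."""
    name: str
    is_synthetic: bool


def select_collections(
    available: Iterable[Collection],
    checked: Optional[Set[str]] = None,
    include_synthetic: bool = True,  # assumed default; the real toggle may differ
) -> List[Collection]:
    """Pick the collections a query would run against.

    With no names checked, every available collection is used, mirroring
    the "runs across all available collections" default described above.
    """
    selected = []
    for collection in available:
        if checked is not None and collection.name not in checked:
            continue  # the user ticked specific collections and this is not one
        if collection.is_synthetic and not include_synthetic:
            continue  # synthetic collections are switched off
        selected.append(collection)
    return selected


# Made-up collections for illustration.
available = [
    Collection("Hospital A", is_synthetic=False),
    Collection("Registry B", is_synthetic=False),
    Collection("Synthetic Demo", is_synthetic=True),
]

everything = select_collections(available)
real_only = select_collections(available, include_synthetic=False)
just_registry = select_collections(available, checked={"Registry B"})

print([c.name for c in everything])
print([c.name for c in real_only])
print([c.name for c in just_registry])
```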
FAQs
Who can access Cohort Discovery?
Access is available to approved researchers from academia, the public sector, industry, and international organisations. Your level of access may vary based on your organisation type and the permissions set by individual Data Custodians.
What data can I search using Cohort Discovery?
You can explore pseudonymised patient counts across datasets provided by participating data custodians. Each dataset reflects what has been made available for discovery based on the data partner's governance, format, and permissions.
Can I use Cohort Discovery to run live analyses or download data?
No. Cohort Discovery is for feasibility assessment only. It returns aggregate counts—not individual-level or downloadable data. If you wish to request access to the actual data, you must submit a formal request via the Gateway.
I’m based outside the UK—can I use Cohort Discovery?
Yes. Some Data Custodians allow access to international users, while others restrict access to UK-based researchers. The system automatically filters what you can access based on your user group.

