Public consultation on our draft dataset standards

To build artificial intelligence (AI) healthcare technologies which benefit all patients, we need datasets which represent the diverse range of people for whom they are intended to be used.

Unfortunately, health datasets often do not adequately represent minoritised populations, who are consequently less able to benefit from technologies based on these datasets, a phenomenon sometimes referred to as ‘health data poverty’.1

There is increasing recognition that AI trained on inadequately diverse datasets can cause real harm to those who are not appropriately represented. This may be because they are not represented at all, because they are represented inaccurately, or because of other forms of bias within the dataset. Getting the data right is a critical part of enabling AI healthcare technologies to perform equitably across the diversity of the people they are intended to serve.

STANDING Together aims to ensure that inclusivity and diversity are considered when developing health datasets, by creating recommendations for healthcare dataset documentation and use.  

This means that AI healthcare technologies can be built and tested using data which adequately represent the people they are intended to serve. Our green paper provides a critical update on these recommendations, and an opportunity for anyone from any community to contribute. These recommendations have been compiled through an 18-month programme of systematic reviews, surveys, in-depth interviews, and a modified Delphi study. They have been informed by the views of over 250 stakeholders across 32 countries, an international advisory group and a patient and public committee.

It is essential that the recommendations are ‘sense-checked’ by as many people as possible, who can bring their diverse experiences and knowledge to challenge and improve them. 

There might be things that have been missed out, that are unclear, or that you disagree with. This consultation also provides an opportunity for individuals and organisations who work with data to start planning how they could adopt these recommendations.

STANDING Together is building STANdards for data Diversity, INclusivity, & Generalisability. Established in 2021 as part of the NHS AI Lab’s AI Ethics initiative, it is a partnership between over 30 academic, regulatory, policy, industry, and charitable organisations worldwide. STANDING Together is funded by the NHS AI Lab at the NHS Transformation Directorate and The Health Foundation and managed by the National Institute for Health and Care Research (AI_HI200014).