REVIEW

A global review of publicly available datasets for ophthalmological imaging: barriers to access, usability, and generalisability


Dr Saad Khan

The availability of health datasets has accelerated digital health research. Ophthalmology has been one of the leading areas of innovation, and several public ophthalmic imaging datasets have been used in machine learning research. Datasets are a critical component of machine learning algorithm development, so they need careful scrutiny before use. Before our review, it was unknown how many ophthalmic datasets existed, how accessible they were, and what they comprised. Therefore, we undertook a global review of all publicly available ophthalmic imaging datasets to create a central directory, detail their accessibility, describe which diseases and populations are represented, and report on the completeness of associated metadata.



What did we find?



Why do these findings matter?



How can we address the gaps?



Bar chart showing number of datasets associated with publication date. World map showing geographical distribution of datasets.
Bar chart showing diseases represented by datasets. Bar chart showing imaging modalities represented across datasets.

Figure 2 from the paper: Information associated with the publication date (A), geographical distribution (B), represented diseases (C), and image types (D) of the study datasets





Source: Khan S, Liu X, Nath S, Korot E, Faes L, Wagner S et al. A global review of publicly available datasets for ophthalmological imaging: barriers to access, usability, and generalisability. The Lancet Digital Health. 2021;3(1):e51-e66.