Synthetic COVID-19 Chest X-ray Dataset for Computer-Aided Diagnosis
Abstract
We introduce a new dataset called Synthetic COVID-19 Chest X-ray Dataset for training machine learning models. The dataset consists of 21,295 synthetic COVID-19 chest X-ray images to be used for computer-aided diagnosis. These images, generated via an unsupervised domain adaptation approach, are of high quality. We find that the synthetic images not only improve performance of various deep learning architectures when used as additional training data under heavy imbalance conditions, but also detect the target class with high confidence. We also find that comparable performance can also be achieved when trained only on synthetic images. Further, salient features of the synthetic COVID-19 images indicate that the distribution is significantly different from Non-COVID-19 classes, enabling a proper decision boundary. We hope the availability of such high fidelity chest X-ray images of COVID-19 will encourage advances in the development of diagnostic and/or management tools.
- Publication:
-
arXiv e-prints
- Pub Date:
- June 2021
- DOI:
- 10.48550/arXiv.2106.09759
- arXiv:
- arXiv:2106.09759
- Bibcode:
- 2021arXiv210609759Z
- Keywords:
-
- Electrical Engineering and Systems Science - Image and Video Processing;
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Machine Learning