MONICA: Benchmarking on Long-tailed Medical Image Classification

doi:10.48550/arXiv.2410.02010

Long-tailed learning is considered an extremely challenging problem in imbalanced-data learning. It aims to train well-generalized models from a large number of images that follow a long-tailed class distribution. In the medical field, many diagnostic imaging exams, such as dermoscopy and chest radiography, yield a long-tailed distribution of complex clinical findings. Long-tailed learning in medical image analysis has recently garnered significant attention. However, the field currently lacks a unified, strictly formulated, and comprehensive benchmark, which often leads to unfair comparisons and inconclusive results. To help the community improve evaluation and advance the field, we build a unified, well-structured codebase called Medical OpeN-source Long-taIled ClassifiCAtion (MONICA), which implements over 30 methods developed in relevant fields and evaluates them on 12 long-tailed medical datasets covering 6 medical domains. Our work provides practical guidance and insights for the field, offering detailed analysis and discussion of the effectiveness of individual components within the built-in state-of-the-art methodologies. We hope this codebase serves as a comprehensive and reproducible benchmark, encouraging further advances in long-tailed medical image learning. The codebase is publicly available at https://github.com/PyJulie/MONICA.

Publication:

arXiv e-prints

Pub Date:

October 2024

DOI:

10.48550/arXiv.2410.02010

arXiv:

arXiv:2410.02010

Bibcode:

2024arXiv241002010J

Keywords:

Electrical Engineering and Systems Science - Image and Video Processing;
Computer Science - Computer Vision and Pattern Recognition

NASA/ADS
