Radiomic feature selection for lung cancer classifiers
Abstract
Machine learning methods with quantitative imaging features integration have recently gained a lot of attention for lung nodule classification. However, there is a dearth of studies in the literature on effective features ranking methods for classification purpose. Moreover, optimal number of features required for the classification task also needs to be evaluated. In this study, we investigate the impact of supervised and unsupervised feature selection techniques on machine learning methods for nodule classification in Computed Tomography (CT) images. The research work explores the classification performance of Naive Bayes and Support Vector Machine(SVM) when trained with 2, 4, 8, 12, 16 and 20 highly ranked features from supervised and unsupervised ranking approaches. The best classification results were achieved using SVM trained with 8 radiomic features selected from supervised feature ranking methods and the accuracy was 100%. The study further revealed that very good nodule classification can be achieved by training any of the SVM or Naive Bayes with a fewer radiomic features. A periodic increment in the number of radiomic features from 2 to 20 did not improve the classification results whether the selection was made using supervised or unsupervised ranking approaches.
- Publication:
-
arXiv e-prints
- Pub Date:
- March 2020
- DOI:
- arXiv:
- arXiv:2003.07098
- Bibcode:
- 2020arXiv200307098S
- Keywords:
-
- Electrical Engineering and Systems Science - Image and Video Processing;
- Computer Science - Computer Vision and Pattern Recognition
- E-Print:
- Journal of Intelligent &