A Comparative Study of Machine Learning Methods for Verbal Autopsy Text Classification

doi:10.48550/arXiv.1402.4380

A Verbal Autopsy is the record of an interview about the circumstances of an uncertified death. In developing countries, if a death occurs away from health facilities, a field-worker interviews a relative of the deceased about the circumstances of the death; this Verbal Autopsy can then be reviewed off-site. We report on a comparative study of three stages of Text Classification as applied to classifying Cause of Death: feature value representation, machine learning classification algorithms, and feature reduction strategies. The aim is to identify the approaches best suited to classifying Verbal Autopsy text. We demonstrate that normalised term frequency and standard TF-IDF achieve comparable performance across a number of classifiers. The results also show that the Support Vector Machine outperforms the other classification algorithms employed in this research. Finally, we demonstrate that a "locally-semi-supervised" feature reduction strategy improves classification accuracy.
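To make the two feature value representations compared in the abstract concrete, here is a minimal Python sketch. It rests on assumptions: the abstract does not give the formulas used in the study, so normalised term frequency is taken here as the term count divided by document length, and TF-IDF as the raw count times log(N / document frequency), which is one common variant. The function names and the toy documents are hypothetical and are not drawn from the study's data.

```python
import math
from collections import Counter


def normalised_tf(tokens):
    """Term counts divided by document length (assumed definition)."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: n / total for term, n in counts.items()}


def tf_idf(docs):
    """Raw term count times log(N / document frequency), one common variant."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {term: n * math.log(n_docs / df[term]) for term, n in Counter(doc).items()}
        for doc in docs
    ]


# Hypothetical toy narratives, not taken from the study's data.
docs = [
    "fever cough fever".split(),
    "cough chest pain".split(),
    "fever rash".split(),
]

ntf = [normalised_tf(doc) for doc in docs]
tfidf = tf_idf(docs)
```

Under these assumptions, normalised term frequency depends only on the document itself, while TF-IDF also down-weights terms that occur in many documents. Either set of weights could then be fed to a classifier such as a Support Vector Machine.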

Publication:

arXiv e-prints

Pub Date:

February 2014

DOI:

10.48550/arXiv.1402.4380

arXiv:

arXiv:1402.4380

Bibcode:

2014arXiv1402.4380D

Keywords:

Computer Science - Computation and Language

E-Print:

10 pages
