Ensemble of classifiers for speech evaluation

doi:10.48550/arXiv.2501.00067

The article describes an attempt to apply an ensemble of binary classifiers to the problem of speech assessment in medicine. A dataset was compiled from quantitative and expert assessments of syllable pronunciation quality. The values of seven selected metrics were used as features: dynamic time warping (DTW) distance, Minkowski distance, correlation coefficient, longest common subsequence (LCSS), edit distance on real sequence (EDR), edit distance with real penalty (ERP), and move-split-merge (MSM). Expert assessment of pronunciation quality was used as the class label: class 1 means high-quality speech, class 0 means distorted speech. Training results were compared for five classification methods: logistic regression (LR), support vector machine (SVM), naive Bayes (NB), decision trees (DT), and K-nearest neighbors (KNN). The results of using the mixture method to build an ensemble of classifiers are also presented. On the studied datasets, the ensemble slightly increased classification accuracy compared with the individual binary classifiers.
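To make the feature definitions more concrete, the following is a minimal sketch of three of the seven metrics (Minkowski distance, DTW distance, and LCSS) for one-dimensional numeric sequences. The abstract does not give the paper's exact formulations, so the function names, the absolute-difference local cost in DTW, and the matching threshold `eps` in LCSS are assumptions made for illustration only.

```python
import math


def minkowski(a, b, p=2):
    """Minkowski distance of order p between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def dtw(a, b):
    """Dynamic time warping distance using |x - y| as the local cost (assumed)."""
    n, m = len(a), len(b)
    # d[i][j] holds the cheapest cost of aligning a[:i] with b[:j].
    d = [[math.inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def lcss(a, b, eps=0.1):
    """Length of the longest common subsequence; two samples match
    when they differ by at most eps (eps is an assumed parameter)."""
    n, m = len(a), len(b)
    t = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                t[i][j] = t[i - 1][j - 1] + 1
            else:
                t[i][j] = max(t[i - 1][j], t[i][j - 1])
    return t[n][m]


# Hypothetical reference and test contours of different lengths.
reference = [0, 1, 2, 1, 0]
attempt = [0, 0, 1, 2, 1, 0]
features = {"dtw": dtw(reference, attempt), "lcss": lcss(reference, attempt)}
```

In a pipeline like the one the abstract describes, such values computed between a reference and a test pronunciation would form part of the feature vector passed to each binary classifier. Minkowski distance requires sequences of equal length, while DTW and LCSS tolerate different lengths.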

Publication:

arXiv e-prints

Pub Date:

December 2024

DOI:

10.48550/arXiv.2501.00067

arXiv:

arXiv:2501.00067

Bibcode:

2025arXiv250100067B

Keywords:

Computer Science - Sound;
Computer Science - Artificial Intelligence;
Electrical Engineering and Systems Science - Audio and Speech Processing
