Multi-TGDR: A Regularization Method for Multi-Class Classification in Microarray Experiments
Abstract
Background: As microarray technology has matured and become popular, selecting a small number of relevant genes for accurate classification of samples has become an active topic in biostatistics and bioinformatics. However, most existing algorithms cannot handle multiple classes, which is arguably a common application. Here, we propose an extension of an existing regularization algorithm, Threshold Gradient Descent Regularization (TGDR), specifically to tackle multi-class classification of microarray data. When several microarray experiments address the same or similar objectives, one option is to use the meta-analysis version of TGDR (Meta-TGDR), which treats the classification task as a combination of classifiers that share the same structure/model while allowing the parameters to vary across studies. However, the original Meta-TGDR extension did not offer a way to make predictions on independent samples. Here, we propose an explicit method to estimate the overall coefficients of the biomarkers selected by Meta-TGDR. This extension permits broader applicability and allows the predictive performance of Meta-TGDR and TGDR to be compared on an independent test set.
Results: Using real-world applications, we demonstrated that the proposed multi-TGDR framework works well and that it selects fewer genes than the sum over all individual binary TGDRs. Additionally, Meta-TGDR and TGDR applied to the batch-effect-adjusted pooled data gave approximately the same results. Adding a Bagging procedure in each application ensured stability and good predictive performance.
Conclusions: Compared with Meta-TGDR, TGDR is less computationally intensive and does not require samples of all classes in each study. On the adjusted data, its predictive performance is approximately the same as that of Meta-TGDR. Thus, it is highly recommended.
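For readers unfamiliar with TGDR, the following is a minimal sketch of the thresholded gradient update as it is commonly described for binary logistic regression: at each step, only coefficients whose gradient magnitude is at least a fraction `tau` of the largest gradient magnitude are updated. It is an illustration under stated assumptions, not the paper's implementation. It covers the binary case only and does not reproduce the multi-class extension, Meta-TGDR, or the Bagging procedure. The function name `tgdr_binary`, the parameters `tau`, `step`, and `n_iter`, and the simulated data are all hypothetical.

```python
import numpy as np

def tgdr_binary(X, y, tau=0.8, step=0.01, n_iter=200):
    """Sketch of threshold gradient descent for binary logistic regression.

    Assumption: y is coded 0/1 and X has no intercept column.
    tau in [0, 1] controls sparsity: tau = 1 updates only the coordinate(s)
    with the largest gradient, tau = 0 is plain gradient ascent.
    """
    n_features = X.shape[1]
    beta = np.zeros(n_features)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))   # predicted P(y = 1)
        grad = X.T @ (y - prob)                     # log-likelihood gradient
        max_abs = np.max(np.abs(grad))
        if max_abs == 0.0:
            break                                   # stationary point reached
        mask = np.abs(grad) >= tau * max_abs        # threshold indicator
        beta += step * grad * mask                  # update selected coords only
    return beta

# Simulated data (hypothetical): 60 samples, 20 "genes", 2 informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))
y = (X[:, 0] - X[:, 1] > 0).astype(float)

beta = tgdr_binary(X, y, tau=0.9, step=0.01, n_iter=100)
selected = np.flatnonzero(beta)   # indices of features with nonzero coefficients
```

In this sketch, features whose coefficients remain exactly zero are treated as not selected. In practice `tau` and the number of iterations would be tuned, for example by cross-validation.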
- Publication: PLoS ONE
- Pub Date: November 2013
- DOI: 10.1371/journal.pone.0078302
- arXiv: arXiv:1307.5576
- Bibcode: 2013PLoSO...878302T
- Keywords: Statistics - Methodology
- E-Print: doi:10.1371/journal.pone.0078302