EnhancerPred: a predictor for discovering enhancers based on the combination and selection of multiple features
Abstract
Enhancers are cis-acting elements that play an important role in gene regulation by increasing the expression of their target genes. Recent studies of modifications have revealed that enhancers form a large group of functional elements with many distinct subgroups, which differ in their biological activities and in their regulatory effects on target genes. Several computational methods have been proposed as auxiliary tools to distinguish enhancers from other regulatory elements, but only one method has addressed clustering them into subgroups. In this study, we developed a predictor, called EnhancerPred, to distinguish enhancers from nonenhancers and to determine enhancers’ strength. A two-step wrapper-based feature selection method was applied to a high-dimensional feature vector derived from bi-profile Bayes and pseudo-nucleotide composition. The combination of 104 features from bi-profile Bayes, 1 feature from nucleotide composition and 9 features from pseudo-nucleotide composition yielded the best performance for distinguishing enhancers from nonenhancers, with an overall accuracy (Acc) of 77.39%. The combination of 89 features from bi-profile Bayes and 10 features from pseudo-nucleotide composition yielded the best performance for distinguishing strong enhancers from weak enhancers, with an overall Acc of 68.19%. Because the feature optimization process selected different feature subsets for the two tasks, it is necessary to construct a separate model for identifying strong and weak enhancers.
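As a rough illustration of two of the feature types named in the abstract, the sketch below computes nucleotide composition and a bi-profile-style encoding for toy sequences. It is not the authors' implementation: the function names, the toy sequences, and the use of plain per-position frequencies (with no pseudocounts or smoothing) are all assumptions made for this example. The bi-profile idea shown here is that each position of a sequence is described twice, once by how common its nucleotide is at that position in the positive training set and once by how common it is in the negative training set.

```python
from collections import Counter

ALPHABET = "ACGT"

def nucleotide_composition(seq):
    """Return the frequencies of A, C, G and T in seq (4 features)."""
    counts = Counter(seq.upper())
    n = len(seq)
    return [counts[b] / n for b in ALPHABET]

def position_profile(seqs):
    """Per-position nucleotide frequencies for equal-length sequences.

    Hypothetical helper: plain frequencies, no smoothing.
    """
    length = len(seqs[0])
    profile = []
    for i in range(length):
        column = Counter(s[i].upper() for s in seqs)
        profile.append({b: column[b] / len(seqs) for b in ALPHABET})
    return profile

def bi_profile_features(seq, pos_profile, neg_profile):
    """Encode seq as 2L features: L values read from the positive
    profile followed by L values read from the negative profile."""
    seq = seq.upper()
    pos = [pos_profile[i].get(b, 0.0) for i, b in enumerate(seq)]
    neg = [neg_profile[i].get(b, 0.0) for i, b in enumerate(seq)]
    return pos + neg

# Toy training sets (invented for illustration only).
positives = ["ACGT", "ACGA"]
negatives = ["TTGT", "TCGA"]

pos_profile = position_profile(positives)
neg_profile = position_profile(negatives)

features = bi_profile_features("ACGT", pos_profile, neg_profile)
composition = nucleotide_composition("AACG")
```

In a pipeline of the kind the abstract describes, vectors like `features` and `composition` would be concatenated with the other feature types and then reduced by the feature selection step before training a classifier.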
- Publication: Scientific Reports
- Pub Date: December 2016
- DOI: 10.1038/srep38741
- Bibcode: 2016NatSR...638741J