BioMM: Biologically-informed Multi-stage Machine learning for identification of epigenetic fingerprints
The identification of reproducible biological patterns from high-dimensional data is a bottleneck for understanding the biology of complex illnesses such as schizophrenia. To address this, we developed a biologically informed, multi-stage machine learning (BioMM) framework. BioMM incorporates biological pathway information to stratify and aggregate high-dimensional biological data. We demonstrate the utility of this method using genome-wide DNA methylation data and show that it substantially outperforms conventional machine learning approaches. Therefore, the BioMM framework may be a fruitful machine learning strategy in high-dimensional data and be the basis for future, integrative analysis approaches.