SIMLR: A Tool for Large-Scale Genomic Analyses by Multi-Kernel Learning
Abstract
We present SIMLR (Single-cell Interpretation via Multi-kernel LeaRning), an open-source tool that implements a novel framework for learning a sample-to-sample similarity measure from expression data observed for heterogeneous samples. SIMLR can be used effectively for tasks such as dimension reduction, clustering, and visualization of heterogeneous populations of samples. SIMLR was benchmarked against state-of-the-art methods for these three tasks on several public datasets; the results show it to be scalable and capable of greatly improving clustering performance, and it provides valuable insights by making the data more interpretable through better visualization.
Availability and Implementation: SIMLR is available on GitHub in both R and MATLAB implementations. It is also available as an R package on http://bioconductor.org.
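The idea of building one sample-to-sample similarity from several kernels can be illustrated with a minimal sketch. This is not SIMLR's algorithm or its API: the abstract does not describe how kernels are weighted, so the sketch simply averages Gaussian kernels of different bandwidths with uniform weights. The function names `gaussian_kernel` and `combined_similarity`, the bandwidth values, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(X, sigma):
    """Gaussian (RBF) kernel matrix for the rows of X with bandwidth sigma."""
    sq_norms = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances between samples (rows).
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from round-off
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def combined_similarity(X, sigmas, weights=None):
    """Weighted sum of Gaussian kernels; uniform weights by default.

    Assumption: uniform weights stand in for weights that a multi-kernel
    method would learn from the data.
    """
    kernels = np.stack([gaussian_kernel(X, s) for s in sigmas])
    if weights is None:
        weights = np.full(len(sigmas), 1.0 / len(sigmas))
    weights = np.asarray(weights, dtype=float)
    # Contract the weight vector against the stack of kernel matrices.
    return np.tensordot(weights, kernels, axes=1)

# Synthetic "expression" data: two groups of 5 samples, 20 features each.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.5, size=(5, 20))
group_b = rng.normal(loc=3.0, scale=0.5, size=(5, 20))
X = np.vstack([group_a, group_b])

S = combined_similarity(X, sigmas=[1.0, 2.0, 4.0])
```

In this toy setting, `S` is a 10 x 10 symmetric matrix in which samples from the same group are more similar to each other than to samples from the other group; a matrix of this kind is what downstream clustering or visualization steps would consume.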
- Publication:
- arXiv e-prints
- Pub Date:
- March 2017
- DOI:
- 10.48550/arXiv.1703.07844
- arXiv:
- arXiv:1703.07844
- Bibcode:
- 2017arXiv170307844W
- Keywords:
- Quantitative Biology - Genomics;
- Computer Science - Machine Learning;
- Quantitative Biology - Quantitative Methods
- E-Print:
- doi:10.1002/pmic.201700232