Principled and interpretable alignability testing and integration of single-cell data
Abstract
Aligning and integrating different datasets is a key challenge in single-cell research. However, existing methods suffer from several fundamental and under-appreciated limitations. First, we do not have a rigorous statistical test for determining whether two single-cell datasets should even be integrated. Moreover, popular methods substantially distort the data during alignment, making the downstream analysis subject to bias and difficult to interpret. We address both challenges with a unified spectral manifold alignment and inference (SMAI) framework. SMAI is a flexible and interpretable method for aligning datasets with the same type of features, equipped with an alignability test justified by statistical theory. It preserves within-data structures and improves downstream analyses, such as identification of differentially expressed genes and imputation of spatial transcriptomics.
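To make the idea of an alignment that "preserves within-data structures" concrete, the sketch below aligns one dataset to another using only a shift, a single global scale, and a rotation or reflection (classical orthogonal Procrustes). It is not the SMAI algorithm, whose details are not given in this abstract. It is a generic illustration that assumes the rows of the two matrices are already paired, and the name `rigid_align` is hypothetical.

```python
import numpy as np

def rigid_align(X, Y):
    """Map Y onto X with a shift, one global scale, and an orthogonal matrix.

    Illustrative assumption: row i of X and row i of Y are the same
    observation. Because the map is a similarity transform, all pairwise
    distances within Y change only by the common factor `scale`.
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my                  # center both datasets
    U, s, Vt = np.linalg.svd(Yc.T @ Xc)      # SVD of the cross-product matrix
    R = U @ Vt                               # orthogonal R minimizing ||Yc R - Xc||_F
    scale = s.sum() / (Yc ** 2).sum()        # least-squares global scale
    Y_aligned = scale * Yc @ R + mx
    return Y_aligned, R, scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 5))
    Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))  # random orthogonal matrix
    Y = 2.0 * X @ Q + 3.0                         # rotated, scaled, shifted copy of X
    Y_aligned, R, scale = rigid_align(X, Y)
    print("max abs error:", np.abs(Y_aligned - X).max())
    print("recovered scale:", scale)
```

Restricting the map to this family is one simple way to avoid distorting a dataset during alignment: relative distances between cells within the aligned dataset are unchanged up to the common scale factor.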
- Publication:
- Proceedings of the National Academy of Sciences
- Pub Date:
- February 2024
- DOI:
- arXiv:
- arXiv:2308.01839
- Bibcode:
- 2024PNAS..12113719M
- Keywords:
- Quantitative Biology - Quantitative Methods;
- Computer Science - Computer Vision and Pattern Recognition;
- Quantitative Biology - Genomics;
- Statistics - Applications;
- Statistics - Machine Learning
- E-Print:
- Proceedings of the National Academy of Sciences, 2024, 121(10) e2313719121