Structural Learning and Integrative Decomposition of Multi-View Data
Abstract
The increased availability of multi-view data (data on the same samples from multiple sources) has led to strong interest in models based on low-rank matrix factorizations. These models represent each data view via shared and individual components, and have been successfully applied to exploratory dimension reduction, association analysis between the views, and further learning tasks such as consensus clustering. Despite these advances, significant challenges remain in modeling partially-shared components and in identifying the number of components of each type (shared/partially-shared/individual). In this work, we formulate a novel linked component model that directly incorporates partially-shared structures. We call this model SLIDE, for Structural Learning and Integrative DEcomposition of multi-view data. We prove the existence of the SLIDE decomposition and explicitly characterize the identifiability conditions. The proposed model fitting and selection techniques allow for joint identification of the number of components of each type, in contrast to existing sequential approaches. In our empirical studies, SLIDE demonstrates excellent performance in both signal estimation and component selection. We further illustrate the methodology on breast cancer data from The Cancer Genome Atlas repository.
- Publication: arXiv e-prints
- Pub Date: July 2017
- DOI: 10.48550/arXiv.1707.06573
- arXiv: arXiv:1707.06573
- Bibcode: 2017arXiv170706573G
- Keywords: Statistics - Machine Learning; Statistics - Methodology
- E-Print: Biometrics 2019, Vol. 75, No. 4, 1121-1132