BLOCCS: Block Sparse Canonical Correlation Analysis With Application To Interpretable Omics Integration
Abstract
We introduce Block Sparse Canonical Correlation Analysis, which estimates multiple pairs of canonical directions (together a "block") at once. Estimating the pairs jointly significantly improves the orthogonality of the sparse directions, which, as we demonstrate, translates into more interpretable solutions. Our approach builds on the sparse CCA (sCCA) method of Solari, Brown, and Bickel (2019): we also express the bi-convex objective of our block formulation as a concave minimization problem over an orthogonal k-frame in a unit Euclidean ball. Because the objective is concave, the search space shrinks to a Stiefel manifold, over which we optimize with a gradient descent algorithm. Our simulations show that our method outperforms existing sCCA algorithms and implementations in terms of computational cost and stability, mainly due to the drastic shrinkage of our search space, and in terms of the correlation within and orthogonality between pairs of estimated canonical covariates. Finally, we apply our method, available as an R package called BLOCCS, to multi-omic data on Lung Squamous Cell Carcinoma (LUSC) obtained via The Cancer Genome Atlas, and demonstrate its capability to capture meaningful biological associations relevant to the hypothesis under study rather than spurious dominant variations.
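The idea of running gradient descent over a Stiefel manifold (the set of matrices with orthonormal columns) can be illustrated with a minimal NumPy sketch. This is not the BLOCCS algorithm and it includes no sparsity penalty. It assumes a toy concave objective, f(X) = -trace(XᵀAX) for a symmetric positive semi-definite matrix A, whose minimizer over orthonormal k-frames spans the top-k eigenvectors of A. The function names `qr_retract` and `stiefel_gradient_descent`, the step-size rule, and the QR-based retraction are all illustrative choices, not taken from the paper or the R package.

```python
import numpy as np


def qr_retract(Y):
    """Map a full-rank p x k matrix back onto the Stiefel manifold via QR."""
    Q, R = np.linalg.qr(Y)
    # Flip column signs so diag(R) is positive, which makes the factor unique.
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs


def stiefel_gradient_descent(A, k, step=None, n_iter=500, seed=0):
    """Minimize the toy objective f(X) = -trace(X^T A X) subject to X^T X = I_k.

    Assumption: A is symmetric positive semi-definite. This is an
    illustrative stand-in objective, not the BLOCCS objective.
    """
    rng = np.random.default_rng(seed)
    p = A.shape[0]
    X = qr_retract(rng.standard_normal((p, k)))  # random orthonormal start
    if step is None:
        # Heuristic step size based on the spectral norm of A.
        step = 0.5 / np.linalg.norm(A, 2)
    for _ in range(n_iter):
        G = -2.0 * A @ X                      # Euclidean gradient of f
        XtG = X.T @ G
        # Project onto the tangent space of the manifold at X.
        rgrad = G - X @ (XtG + XtG.T) / 2.0
        # Take a descent step, then retract back onto the manifold.
        X = qr_retract(X - step * rgrad)
    return X
```

For a matrix A with well-separated leading eigenvalues, the returned X keeps orthonormal columns at every iteration, and trace(XᵀAX) approaches the sum of the k largest eigenvalues of A. Restricting the iterates to the manifold is what keeps the estimated directions exactly orthogonal.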
- Publication: arXiv e-prints
- Pub Date: September 2019
- DOI: 10.48550/arXiv.1909.07944
- arXiv: arXiv:1909.07944
- Bibcode: 2019arXiv190907944S
- Keywords: Statistics - Machine Learning; Computer Science - Machine Learning
- E-Print: 8 pages