essHi-C: Essential component analysis of Hi-C matrices
Abstract
Motivation: Hi-C matrices are cornerstones for qualitative and quantitative studies of genome folding, from its territorial organization to compartments and topological domains. The high dynamic range of genomic distances probed in Hi-C assays is reflected in an inherent stochastic background of the interaction matrices, which inevitably convolves the features of interest with largely aspecific ones. Results: Here we introduce and discuss essHi-C, a method to isolate the specific, or essential, component of Hi-C matrices from the aspecific portion of the spectrum that is compatible with random matrices. Systematic comparisons show that essHi-C improves the clarity of the interaction patterns, enhances the robustness against sequencing depth, allows the unsupervised clustering of experiments in different cell lines, and recovers the cell-cycle phasing of single cells based on Hi-C data. Thus, essHi-C provides a means for isolating significant biological and physical features from Hi-C matrices.
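The idea of separating an essential spectral component from a random-matrix-like background can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the normalization or the criterion for choosing which eigenspaces are essential, so the sketch assumes a symmetric input matrix and simply keeps a user-chosen number of eigenspaces, `n_keep`, ranked by absolute eigenvalue. The function name `essential_component` and the toy two-block matrix are hypothetical.

```python
import numpy as np


def essential_component(matrix, n_keep):
    """Reconstruct a symmetric matrix from its n_keep eigenspaces
    with the largest absolute eigenvalues (illustrative assumption,
    not the selection rule used by essHi-C)."""
    sym = 0.5 * (matrix + matrix.T)           # enforce exact symmetry
    eigvals, eigvecs = np.linalg.eigh(sym)    # real spectrum, orthonormal vectors
    order = np.argsort(np.abs(eigvals))[::-1][:n_keep]
    vals = eigvals[order]
    vecs = eigvecs[:, order]
    # Sum of lambda_k * v_k v_k^T over the retained eigenspaces
    return (vecs * vals) @ vecs.T


# Toy data: a two-block "compartment-like" signal plus symmetric noise.
rng = np.random.default_rng(0)
n = 60
labels = np.repeat([1.0, -1.0], n // 2)
signal = np.outer(labels, labels)             # rank-1 checkerboard pattern
noise = rng.normal(size=(n, n))
noise = 0.5 * (noise + noise.T)               # symmetric random background
toy = signal + noise

ess = essential_component(toy, n_keep=1)
err_raw = np.linalg.norm(toy - signal)        # Frobenius distance before filtering
err_ess = np.linalg.norm(ess - signal)        # Frobenius distance after filtering
```

In this toy setting the single retained eigenspace captures the block pattern, so `err_ess` comes out smaller than `err_raw`. On real Hi-C data the number of retained eigenspaces would have to be chosen by comparison with a random-matrix reference, which this sketch leaves out.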
- Publication: arXiv e-prints
- Pub Date: January 2021
- DOI: 10.48550/arXiv.2101.10645
- arXiv: arXiv:2101.10645
- Bibcode: 2021arXiv210110645F
- Keywords: Quantitative Biology - Genomics; Quantitative Biology - Biomolecules; Statistics - Applications
- E-Print: 14 pages, 4 figures. This is the Authors' Original Version of the article, which has been accepted for publication in Bioinformatics, published by Oxford University Press