Topological Data Analysis of copy number alterations in cancer
Abstract
Identifying subgroups and properties of cancer biopsy samples is a crucial step towards obtaining precise diagnoses and being able to perform personalized treatment of cancer patients. Recent data collections provide a comprehensive characterization of cancer cell data, including genetic data on copy number alterations (CNAs). We explore the potential to capture information contained in cancer genomic information using a novel topology-based approach that encodes each cancer sample as a persistence diagram of topological features, i.e., high-dimensional voids represented in the data. We find that this technique has the potential to extract meaningful low-dimensional representations in cancer somatic genetic data and demonstrate the viability of some applications on finding substructures in cancer data as well as comparing similarity of cancer types.
- Publication:
-
arXiv e-prints
- Pub Date:
- November 2020
- DOI:
- 10.48550/arXiv.2011.11070
- arXiv:
- arXiv:2011.11070
- Bibcode:
- 2020arXiv201111070G
- Keywords:
-
- Quantitative Biology - Genomics;
- Computer Science - Machine Learning