Efficient Reconstruction of Stochastic Pedigrees
Abstract
We introduce a new algorithm called Rec-Gen for reconstructing the genealogy or pedigree of an extant population purely from its genetic data. We justify our approach by giving a mathematical proof of the effectiveness of Rec-Gen when applied to pedigrees from an idealized generative model that replicates some of the features of real-world pedigrees. Our algorithm is iterative and provides an accurate reconstruction of a large fraction of the pedigree while having relatively low sample complexity, measured in terms of the length of the genetic sequences of the population. We propose our approach as a prototype for further investigation of the pedigree reconstruction problem toward the goal of applications to real-world examples. As such, our results have some conceptual bearing on the increasingly important issue of genomic privacy.
- Publication: arXiv e-prints
- Pub Date: May 2020
- DOI: 10.48550/arXiv.2005.03810
- arXiv: arXiv:2005.03810
- Bibcode: 2020arXiv200503810K
- Keywords: Computer Science - Data Structures and Algorithms; Computer Science - Machine Learning; Quantitative Biology - Populations and Evolution; Quantitative Biology - Quantitative Methods; Statistics - Machine Learning