Efficient Reconstruction of Stochastic Pedigrees

Data Structures and Algorithms 2020-05-11 v1 Machine Learning Populations and Evolution Quantitative Methods Machine Learning

Abstract

We introduce a new algorithm called {\sc Rec-Gen} for reconstructing the genealogy or \textit{pedigree} of an extant population purely from its genetic data. We justify our approach by giving a mathematical proof of the effectiveness of {\sc Rec-Gen} when applied to pedigrees from an idealized generative model that replicates some of the features of real-world pedigrees. Our algorithm is iterative and provides an accurate reconstruction of a large fraction of the pedigree while having relatively low \emph{sample complexity}, measured in terms of the length of the genetic sequences of the population. We propose our approach as a prototype for further investigation of the pedigree reconstruction problem toward the goal of applications to real-world examples. As such, our results have some conceptual bearing on the increasingly important issue of genomic privacy.

Cite

@article{arxiv.2005.03810,
  title  = {Efficient Reconstruction of Stochastic Pedigrees},
  author = {Younhun Kim and Elchanan Mossel and Govind Ramnarayan and Paxton Turner},
  journal= {arXiv preprint arXiv:2005.03810},
  year   = {2020}
}