Related papers: Perfect Phylogeny Haplotyping is Complete for Logs…
In this paper we present a collection of results pertaining to haplotyping. The first set of results concerns the combinatorial problem of reconstructing haplotypes from incomplete and/or imperfectly sequenced haplotype data. More…
The perfect phylogeny is one of the most used models in different areas of computational biology. In this paper we consider the problem of the Persistent Perfect Phylogeny (referred to as P-PP) recently introduced to extend the perfect…
The problem Parsimony Haplotyping (PH) asks for the smallest set of haplotypes which can explain a given set of genotypes, and the problem Minimum Perfect Phylogeny Haplotyping (MPPH) asks for the smallest such set which also allows the…
Haplotype phasing, the process of resolving parental allele inheritance patterns in diploid genomes, is critical for precision medicine and population genetics, yet the underlying optimization is NP-hard, posing a scalability challenge. To…
The computational problem of inferring the full haplotype of a cell starting from read sequencing data is known as haplotype assembly, and consists in assigning all heterozygous Single Nucleotide Polymorphisms (SNPs) to exactly one of the…
The binary perfect phylogeny model is too restrictive to model biological events such as back mutations. In this paper we consider a natural generalization of the model that allows a special type of back mutation. We investigate the problem…
A looming question that must be solved before robotic plant phenotyping capabilities can have significant impact on crop improvement programs is scalability. High Throughput Phenotyping (HTP) uses robotic technologies to analyze crops in…
Computing haplotypes from sequencing data, i.e. haplotype assembly, is an important component of molecular and population genetics problems, including interpreting the effects of genetic variation on complex traits and reconstructing…
The Persistent Perfect Phylogeny, also known as Dollo-1, has been introduced as a generalization of the well-known perfect phylogenetic model for binary characters to deal with the potential loss of characters. The problem of deciding the…
Reconstructing the evolutionary history of a set of species is a central task in computational biology. In real data, it is often the case that some information is missing: the Incomplete Directed Perfect Phylogeny (IDPP) problem asks,…
Gene gain-loss-duplication models are commonly based on continuous-time birth-death processes. Employed in a phylogenetic context, such models have been increasingly popular in studies of gene content evolution across multiple genomes.…
Horizontal gene transfer inference approaches are usually based on gene sequences: parametric methods search for patterns that deviate from a particular genomic signature, while phylogenetic methods use sequences to reconstruct the gene and…
To analyze whole-genome genetic data inherited in families, the likelihood is typically obtained from a Hidden Markov Model (HMM) having a state space of 2^n hidden states where n is the number of meioses or edges in the pedigree. There…
Statistically resolving the underlying haplotype pair for a genotype measurement is an important intermediate step in gene mapping studies, and has received much attention recently. Consequently, a variety of methods for this problem have…
The evolutionary dynamics of molecular populations are strongly dependent on the structure of genotype spaces. The map between genotype and phenotype determines how easily genotype spaces can be navigated and the accessibility of…
Plant traits are a key to understanding and predicting the adaptation of ecosystems to environmental changes, which motivates the TRY project aiming at constructing a global database for plant traits and becoming a standard resource for the…
Humans have $23$ pairs of homologous chromosomes. The homologous pairs are almost identical pairs of chromosomes. For the most part, differences in homologous chromosomes occur at certain documented positions called single nucleotide…
We investigate the space complexity of certain perfect matching problems over bipartite graphs embedded on surfaces of constant genus (orientable or non-orientable). We show that the problems of deciding whether such graphs have (1) a…
The advent of plant phenomics, coupled with the wealth of genotypic data generated by next-generation sequencing technologies, provides exciting new resources for investigations into and improvement of complex traits. However, these new…
This paper studies the haplotype assembly problem from an information theoretic perspective. A haplotype is a sequence of nucleotide bases on a chromosome, often conveniently represented by a binary string, that differ from the bases in the…