Related papers: Computing the Inversion-Indel Distance
The computation of genomic distances has been a very active field of computational comparative genomics over the last 25 years. Substantial results include the polynomial-time computability of the inversion distance by Hannenhalli and…
The edit distance under the DCJ model can be computed in linear time for genomes with equal content or with indels, but it becomes NP-hard in the presence of duplications, a problem that remains largely unsolved, especially when indels are considered. In…
Inversions, also sometimes called reversals, are a major contributor to variation among bacterial genomes, with studies suggesting that those involving small numbers of regions are more likely than larger inversions. Deletions may arise in…
A classical problem in comparative genomics is to compute the rearrangement distance, that is, the minimum number of large-scale rearrangements required to transform a given genome into another given genome. While the most traditional…
Genome rearrangement distances are an established method in genome comparison. Works in this area may include various rearrangement operations representing large-scale mutations, gene orientation information, the number of nucleotides in…
The double-cut-and-join (DCJ) operation, introduced by Yancopoulos et al., allows minimum edit distance to be computed by modeling all possible classical rearrangement operations, such as inversions, fusions, fissions,…
Genome rearrangement has been an active area of research in computational comparative genomics for the last three decades. While initially mostly an interesting algorithmic endeavor, now the practical application by applying rearrangement…
Establishing a distance between genomes is a significant problem in computational genomics, because its solution can be used to establish evolutionary relationships including phylogeny. The "double cut and join" (DCJ) model of chromosomal…
Insertion-deletion codes (insdel codes for short) are used for correcting synchronization errors in communications, and in many other interesting fields such as DNA storage, data analysis, race-track memory error correction and language…
We use a convolutional neural network to retrieve the internuclear distance in the two-dimensional H$_2^{+}$ molecule ionized by a strong few-cycle laser pulse based on the photoelectron momentum distribution. We show that a neural network…
In this paper, we study the problem of sorting unichromosomal linear genomes by prefix double-cut-and-joins (or DCJs) in both the signed and the unsigned settings. Prefix DCJs cut the leftmost segment of a genome and any other segment, and…
Two genomes over the same set of gene families form a canonical pair when each of them has exactly one gene from each family. Different distances of canonical genomes can be derived from a structure called the breakpoint graph, which represents…
Early literature on genome rearrangement modelling views the problem of computing evolutionary distances as an inherently combinatorial one. In particular, attention was given to estimating distances using the minimum number of events…
Measuring the distance between two bacterial genomes under the inversion process is usually done by assuming all inversions to occur with equal probability. Recently, an approach to calculating inversion distance using group theory was…
In comparative genomics, the rearrangement distance between two genomes (equal to the minimal number of genome rearrangements required to transform them into a single genome) is often used for measuring their evolutionary remoteness.…
We investigate the application of deep learning to the retrieval of the internuclear distance in the two-dimensional H$_2^{+}$ molecule from the momentum distribution of photoelectrons produced by strong-field ionization. We study the…
Considering a pair of genomes, the goal of rearrangement distance problems is to estimate how distant these genomes are from each other based on genome rearrangements. Seminal works in genome rearrangements assumed that both genomes being…
DNA is a promising storage medium, but its stability and the occurrence of indel errors pose a significant challenge. The relative occurrence of guanine (G) and cytosine (C) in DNA is crucial for its longevity, and reverse complementary base…
During cancer progression, malignant cells accumulate somatic mutations that can lead to genetic aberrations. In particular, evolutionary events akin to segmental duplications or deletions can alter the copy-number profile (CNP) of a set of…
Nanopore sequencers are emerging as promising new platforms for high-throughput sequencing. As with other technologies, sequencer errors pose a major challenge for their effective use. In this paper, we present a novel information theoretic…