Related papers: Extending the Overlap Graph for Gene Assembly in C…
The biological process of gene assembly has been modeled based on three types of string rewriting rules, called string pointer rules, defined on so-called legal strings. It has been shown that reduction graphs, graphs that are based on the…
Formal models for gene assembly in ciliates have been developed, in particular the string pointer reduction system (SPRS) and the graph pointer reduction system (GPRS). The reduction graph is a valuable tool within the SPRS, revealing much…
The first step in any genome assembly algorithm entails the conversion from the domain of strings and overlaps to the language of graphs and paths, typically using one of the two conventional methods: de Bruijn graphs or overlap graphs.…
DNA rearrangement processes recombine gene segments that are organized on the chromosome in a variety of ways. The segments can overlap, interleave or one may be a subsegment of another. We use directed graphs to represent segment…
Gene assembly in ciliates is one of the most involved DNA processings going on in any organism. This process transforms one nucleus (the micronucleus) into another functionally different nucleus (the macronucleus). We continue the…
We tackle the problem of attributed graph transformations and propose a new algorithmic approach for defining parallel graph transformations allowing overlaps. We start by introducing some abstract operations over graph structures. Then, we…
We describe a graph reduction operation, generalizing three graph reduction operations related to gene assembly in ciliates. The graph formalization of gene assembly considers three reduction rules, called the positive rule, double rule,…
We introduce the graph parameter readability and study it as a function of the number of vertices in a graph. Given a digraph D, an injective overlap labeling assigns a unique string to each vertex such that there is an arc from x to y if…
This paper introduces a new family of reconstruction codes which is motivated by applications in DNA data storage and sequencing. In such applications, DNA strands are sequenced by reading some subset of their substrings. While previous…
Gene assembly in ciliates is an extremely involved DNA transformation process, which transforms a nucleus, the micronucleus, to another functionally different nucleus, the macronucleus. In this paper we characterize which loop recombination…
An exact-match overlap graph of $n$ given strings of length $\ell$ is an edge-weighted graph in which each vertex is associated with a string and there is an edge $(x,y)$ of weight $\omega = \ell - |ov_{max}(x,y)|$ if and only if $\omega…
One of the most computationally intensive tasks in computational biology is de novo genome assembly, the decoding of the sequence of an unknown genome from redundant and erroneous short sequences. A common assembly paradigm identifies…
Genome assembly is a prominent problem studied in bioinformatics, which computes the source string using a set of its overlapping substrings. Classically, genome assembly uses assembly graphs built using this set of substrings to compute…
The basic principle of graph rewriting is the stepwise replacement of subgraphs inside a host graph. A challenge in such replacement steps is the treatment of the patch graph, consisting of those edges of the host graph that touch the…
Overlapping genes exist in all domains of life and are much more abundant than expected at their first discovery in the late 1970s. Assuming that the reference gene is read in frame +0, an overlapping gene can be encoded in two reading…
Theory of splicing is an abstract model of the recombinant behaviour of DNAs. In a splicing system, two strings to be spliced are taken from the same set and the splicing rule is from another set. Here we propose a generalised splicing (GS)…
Given a set of species whose evolution is represented by a species tree, a gene family is a group of genes having evolved from a single ancestral gene. A gene family evolves along the branches of a species tree through various mechanisms,…
We consider principal pivot transform (pivot) on graphs. We define a natural variant of this operation, called dual pivot, and show that both the kernel and the set of maximally applicable pivots of a graph are invariant under this…
We propose a new approach for modelling the process of RNA folding as a graph transformation guided by the global value of free energy. Since the folding process evolves towards a configuration in which the free energy is minimal, the…