Related papers: Generalized Unique Reconstruction from Substrings
This paper introduces a new family of reconstruction codes which is motivated by applications in DNA data storage and sequencing. In such applications, DNA strands are sequenced by reading some subset of their substrings. While previous…
The problem of string reconstruction based on its substrings spectrum has received significant attention recently due to its applicability to DNA data storage and sequencing. In contrast to previous works, we consider in this paper a setup…
This paper studies the problem of encoding messages into sequences which can be uniquely recovered from some noisy observations about their substrings. The observed reads comprise consecutive substrings with some given minimum overlap. This…
The problem of reconstructing strings from their substring spectra has a long history and, in its simplest incarnation, asks under which conditions the spectrum uniquely determines the string. We study the problem of coded…
This paper studies reconstruction of strings based upon their substrings spectrum. Under this paradigm, it is assumed that all substrings of some fixed length are received and the goal is to reconstruct the string. While many existing works…
The problem of string reconstruction from substring information has found many applications due to its relevance in DNA- and polymer-based data storage. One practically important and challenging paradigm requires reconstructing mixtures of…
The problem of reconstructing strings from substring information has found many applications due to its importance in genomic data sequencing and DNA- and polymer-based data storage. One practically important and challenging paradigm…
We consider the problem of coding for the substring channel, in which information strings are observed only through their (multisets of) substrings. Due to existing DNA sequencing techniques and applications in DNA-based storage systems,…
Motivated by mass-spectrometry protein sequencing, we consider a simply-stated problem of reconstructing a string from the multiset of its substring compositions. We show that all strings of length 7, one less than a prime, or one less than…
As the global need for large-scale data storage is rising exponentially, existing storage technologies are approaching their theoretical and functional limits in terms of density and energy consumption, making DNA-based storage a potential…
DNA as a data storage medium has several advantages, including far greater data density compared to electronic media. We propose that schemes for data storage in the DNA of living organisms may benefit from studying the reconstruction…
Motivated by applications in polymer-based data storage, we study the problem of reconstructing a string from part of its composition multiset. We give a full description of the structure of the strings that cannot be uniquely reconstructed…
DNA has immense potential as an emerging data storage medium. The principle of DNA storage is the conversion and flow of digital information between binary code stream, quaternary base, and actual DNA fragments. This process will inevitably…
DNA codes have many applications, such as data storage and DNA computing. Good DNA codes have large sizes and satisfy certain constraints. In this paper, we present a new construction method for reversible DNA codes. We show that…
In this work, we investigate a challenging problem, which has been considered to be an important criterion in designing codewords for DNA computing purposes, namely secondary structure avoidance in single-stranded DNA molecules. In short,…
Genome sequencing is the basis for many modern biological and medicinal studies. With recent technological advances, metagenomics has become a problem of interest. This problem entails the analysis and reconstruction of multiple DNA…
The problem called "String reconstruction from substrings" is a mathematical model of sequencing by hybridization that plays an important role in DNA sequencing. In this problem, we are given a blackbox oracle holding an unknown string…
A new family of codes, called clustering-correcting codes, is presented in this paper. This family of codes is motivated by the special structure of data that is stored in DNA-based storage systems. The data stored in these systems has the…
DNA storage has emerged as an important area of research. The reliability of a DNA storage system depends on designing DNA strings (called DNA codes) that are sufficiently dissimilar. In this work, we introduce DNA codes that satisfy a…
DNA sequences are prone to forming secondary structures by folding back on themselves through non-specific hybridization among their nucleotides. The formation of secondary structures makes the sequences chemically inactive towards synthesis and…