Related papers: Codes for DNA Sequence Profiles
We consider the problem of assembling a sequence based on a collection of its substrings observed through a noisy channel. The mathematical basis of the problem is the construction and design of sequences that may be discriminated based on…
We consider the problem of coding for the substring channel, in which information strings are observed only through their (multisets of) substrings. Due to existing DNA sequencing techniques and applications in DNA-based storage systems,…
DNA as a data storage medium has several advantages, including far greater data density compared to electronic media. We propose that schemes for data storage in the DNA of living organisms may benefit from studying the reconstruction…
DNA storage is now being considered as a new archival storage method for its durability and high information density, but still facing some challenges like high costs and low throughput. By reducing sequencing sample size for decoding…
This paper studies the problem of encoding messages into sequences which can be uniquely recovered from some noisy observations about their substrings. The observed reads comprise consecutive substrings with some given minimum overlap. This…
We provide an overview of current approaches to DNA-based storage system design and accompanying synthesis, sequencing and editing methods. We also introduce and analyze a suite of new constrained coding schemes for both archival and random…
In this review paper, we delve into the nascent field of molecular data storage, focusing on system implementations and code constructions. We start by providing an overview of basic concepts in synthetic and computational biology.…
Due to its longevity and enormous information density, DNA is an attractive medium for archival data storage. Thanks to rapid technological advances, DNA storage is becoming practically feasible, as demonstrated by a number of experimental…
Due to the redundant nature of DNA synthesis and sequencing technologies, a basic model for a DNA storage system is a multi-draw "shuffling-sampling" channel. In this model, a random number of noisy copies of each sequence is observed at…
Recent experiments have demonstrated the feasibility of storing digital information in macromolecules such as DNA and protein. However, the DNA storage channel is prone to errors such as deletions, insertions, and substitutions. During the…
Synthetic DNA approaches 227.5 exabytes per gram of storage density with stability over millennial timescales. Realising this capacity requires error-correction codes that recover data from substantial synthesis and sequencing errors.…
Motivated by DNA-based storage applications, we study the problem of reconstructing a coded sequence from multiple traces. We consider the model where the traces are outputs of independent deletion channels, where each channel deletes each…
DNA storage has emerged as a promising solution for large-scale and long-term data preservation. Among various error types, insertions are the most frequent errors occurring in DNA sequences, where the inserted symbol is often identical or…
DNA storage has emerged as an important area of research. The reliability of DNA storage system depends on designing the DNA strings (called DNA codes) that are sufficiently dissimilar. In this work, we introduce DNA codes that satisfy a…
We study the amount of reliable information that can be stored in a DNA-based storage system with noisy sequencing, where each codeword is composed of short DNA molecules. We analyze a concatenated coding scheme, where the outer code is…
The process of DNA-based data storage (DNA storage for short) can be mathematically modelled as a communication channel, termed DNA storage channel, whose inputs and outputs are sets of unordered sequences. To design error correcting codes…
To increase the information capacity of DNA storage, composite DNA letters were introduced. We propose a novel channel model for composite DNA in which composite sequences are decomposed into ordered standard non-composite sequences. The…
Due to its longevity and enormous information density, DNA is an attractive medium for archival storage. In this work, we study the fundamental limits and trade-offs of DNA-based storage systems by introducing a new channel model, which we…
DNA has immense potential as an emerging data storage medium. The principle of DNA storage is the conversion and flow of digital information between binary code stream, quaternary base, and actual DNA fragments. This process will inevitably…
The synthesis of DNA strands remains the most costly part of the DNA storage system. Thus, to make DNA storage system more practical, the time and materials used in the synthesis process have to be optimized. We consider the most common…