Related papers: Coded trace reconstruction
Motivated by DNA-based storage applications, we study the problem of reconstructing a coded sequence from multiple traces. We consider the model where the traces are outputs of independent deletion channels, where each channel deletes each…
The coded trace reconstruction problem asks to construct a code $C\subset \{0,1\}^n$ such that any $x\in C$ is recoverable from independent outputs ("traces") of $x$ from a binary deletion channel (BDC). We present binary codes of rate…
We consider error-correcting coding for deoxyribonucleic acid (DNA)-based storage using nanopore sequencing. We model the DNA storage channel as a sampling noise channel where the input data is chunked into $M$ short DNA strands, which are…
In the \emph{trace reconstruction problem}, an unknown source string $x \in \{0,1\}^n$ is transmitted through a probabilistic \emph{deletion channel} which independently deletes each bit with some fixed probability $\delta$ and concatenates…
Encoding data as a set of unordered strings is receiving great attention as it captures one of the basic features of DNA storage systems. However, the challenge of constructing optimal redundancy codes for this channel remained elusive. In…
Trace reconstruction is the problem of learning an unknown string $x$ from independent traces of $x$, where traces are generated by independently deleting each bit of $x$ with some deletion probability $q$. In this paper, we initiate the…
Due to its higher data density, longevity, energy efficiency, and ease of generating copies, DNA is considered a promising storage technology for satisfying future needs. However, a diverse set of errors including deletions, insertions,…
Consider a binary word being transmitted through a communication channel that introduces deletable errors where each bit of the word is either retained, flipped, erased or deleted. The simplest code for correcting \emph{all} possible…
In this paper, we derive an expression for the expected number of runs in a trace of a binary sequence $x \in \{0,1\}^n$ obtained by passing $x$ through a deletion channel that independently deletes each bit with probability $q$. We use…
DNA Data storage has recently attracted much attention due to its durable preservation and extremely high information density (bits per gram) properties. In this work, we propose a hybrid coding strategy comprising of generalized…
DNA, with remarkable properties of high density, durability, and replicability, is one of the most appealing storage media. Emerging DNA storage technologies use composite DNA letters, where information is represented by probability…
We describe a strategy for constructing codes for DNA-based information storage by serial composition of weighted finite-state transducers. The resulting state machines can integrate correction of substitution errors; synchronization by…
Large-scale distributed storage systems typically use erasure codes to provide durability of data in the face of failures. A set of $k$ blocks to be stored is encoded using an $[n, k]$ code to generate $n$ blocks that are then stored on…
Nanopore sequencing, superior to other sequencing technologies for DNA storage in multiple aspects, has recently attracted considerable attention. Its high error rates, however, demand thorough research on practical and efficient coding…
The insertion-deletion channel takes as input a bit string ${\bf x}\in\{0,1\}^{n}$, and outputs a string where bits have been deleted and inserted independently at random. The trace reconstruction problem is to recover $\bf x$ from many…
Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-peer storage systems, and storage in wireless networks. Storing…
Most DNA sequencing technologies are based on the shotgun paradigm: many short reads are obtained from random unknown locations in the DNA sequence. A fundamental question, studied in arXiv:1203.6233, is what read length and coverage depth…
In the standard trace reconstruction problem, the goal is to \emph{exactly} reconstruct an unknown source string $\mathsf{x} \in \{0,1\}^n$ from independent "traces", which are copies of $\mathsf{x}$ that have been corrupted by a…
A \emph{trace} of a sequence is generated by deleting each bit of the sequence independently with a fixed probability. The well-studied \emph{trace reconstruction} problem asks how many traces are required to reconstruct an unknown binary…
The amount of digital data is rapidly growing. There is an increasing use of a wide range of computer systems, from mobile devices to large-scale data centers, and important for reliable operation of all computer systems is mitigating the…