Related papers: Coded Shotgun Sequencing

200 papers

DNA sequencing is the basic workhorse of modern day biology and medicine. Shotgun sequencing is the dominant technique used: many randomly located short fragments called reads are extracted from the DNA sequence, and these reads are…

Information Theory · Computer Science 2013-02-15 Abolfazl Motahari, Guy Bresler, David Tse

The prevalent technique for DNA sequencing consists of two main steps: shotgun sequencing, where many randomly located fragments, called reads, are extracted from the overall sequence, followed by an assembly algorithm that aims to…

Genomics · Quantitative Biology 2016-01-28 Shirshendu Ganguly, Elchanan Mossel, Miklos Z. Racz

Genome sequencing is the basis for many modern biological and medicinal studies. With recent technological advances, metagenomics has become a problem of interest. This problem entails the analysis and reconstruction of multiple DNA…

Probability · Mathematics 2022-01-14 Marlee Herring

The shotgun sequencing process involves fragmenting a long DNA sequence (input string) into numerous shorter, unordered, and overlapping segments (referred to as reads). The reads are sequenced, and later aligned to reconstruct the…

Information Theory · Computer Science 2025-09-26 Mohammed Ihsan Ali, Hrishi Narayanan, Prasad Krishnan

In shotgun sequencing, the input string (typically, a long DNA sequence composed of nucleotide bases) is sequenced as multiple overlapping fragments of much shorter lengths (called reads). Modelling the shotgun sequencing pipeline…

Information Theory · Computer Science 2024-05-14 Hrishi Narayanan, Prasad Krishnan, Nita Parekh

We study permutations over the set of $\ell$-grams, that are feasible in the sense that there is a sequence whose $\ell$-gram frequency has the same ranking as the permutation. Codes, which are sets of feasible permutations, protect…

Information Theory · Computer Science 2021-01-18 Niv Beeri, Moshe Schwartz

Current techniques in sequencing a genome allow a service provider (e.g. a sequencing company) to have full access to the genome information, and thus the privacy of individuals regarding their lifetime secret is violated. In this paper, we…

Genomics · Quantitative Biology 2018-11-28 Ali Gholami, Mohammad Ali Maddah-Ali, Seyed Abolfazl Motahari

The DNA storage channel is considered, in which a codeword is comprised of $M$ unordered DNA molecules. At reading time, $N$ molecules are sampled with replacement, and then each molecule is sequenced. A coded-index concatenated-coding…

Information Theory · Computer Science 2022-05-23 Nir Weinberger

The DNA storage channel is considered, in which the $M$ Deoxyribonucleic acid (DNA) molecules comprising each codeword are stored without order, sampled $N$ times with replacement, and then sequenced over a discrete memoryless channel. For…

Information Theory · Computer Science 2022-02-15 Nir Weinberger, Neri Merhav

DNA sequencing has faced a huge demand since it was first introduced as a service to the public. This service is often offloaded to the sequencing companies who will have access to full knowledge of individuals' sequences, a major violation…

Information Theory · Computer Science 2019-04-04 Ali Gholami, Mohammad Ali Maddah-Ali, Seyed Abolfazl Motahari

Synthesis of DNA molecules offers unprecedented advances in storage technology. Yet, the microscopic world in which these molecules reside induces error patterns that are fundamentally different from their digital counterparts. Hence, to…

Information Theory · Computer Science 2017-08-08 Netanel Raviv, Moshe Schwartz, Eitan Yaakobi

We present a framework for the design of optimal assembly algorithms for shotgun sequencing under the criterion of complete reconstruction. We derive a lower bound on the read length and the coverage depth required for reconstruction in…

Genomics · Quantitative Biology 2013-02-20 Guy Bresler, Ma'ayan Bresler, David Tse

Although the expenses associated with DNA sequencing have been rapidly decreasing, the current cost of sequencing information stands at roughly $120/GB, which is dramatically more expensive than reading from existing archival storage…

Discrete Mathematics · Computer Science 2023-11-30 Daniella Bar-Lev, Omer Sabary, Ryan Gabrys, Eitan Yaakobi

As DNA data storage moves closer to practical deployment, minimizing sequencing coverage depth is essential to reduce both operational costs and retrieval latency. This paper addresses the recently studied Random Access Problem, which…

Information Theory · Computer Science 2026-01-13 Chen Wang, Eitan Yaakobi

DNA is a leading candidate as the next archival storage media due to its density, durability and sustainability. To read (and write) data DNA storage exploits technology that has been developed over decades to sequence naturally occurring…

Emerging Technologies · Computer Science 2022-05-12 Jasmine Quah, Omer Sella, Thomas Heinis

DNA data storage systems encode digital data into DNA strands, enabling dense and durable storage. Efficient data retrieval depends on coverage depth, a key performance metric. We study the random access coverage depth problem and focus on…

Information Theory · Computer Science 2025-07-29 Şeyma Bodur, Stefano Lia, Hiram H. López, Rati Ludhani, Alberto Ravagnani, Lisa Seccia

We study the amount of reliable information that can be stored in a DNA-based storage system with noisy sequencing, where each codeword is composed of short DNA molecules. We analyze a concatenated coding scheme, where the outer code is…

Information Theory · Computer Science 2026-05-19 Ran Tamir, Nir Weinberger, Albert Guillén i Fàbregas

DNA storage is now being considered as a new archival storage method for its durability and high information density, but still facing some challenges like high costs and low throughput. By reducing sequencing sample size for decoding…

Information Theory · Computer Science 2025-04-22 Ruiying Cao, Xin Chen

DNA has immense potential as an emerging data storage medium. The principle of DNA storage is the conversion and flow of digital information between binary code stream, quaternary base, and actual DNA fragments. This process will inevitably…

Information Retrieval · Computer Science 2022-10-21 Yun Qin, Fei Zhu, Bo Xi

The coverage depth problem in DNA data storage is about minimizing the expected number of reads until all data is recovered. When they exist, MDS codes offer the best performance in this context. This paper focuses on the scenario where the…

Information Theory · Computer Science 2025-07-29 Matteo Bertuzzo, Alberto Ravagnani, Eitan Yaakobi