Related papers

Related papers: Constrained Consensus Sequence Algorithm for DNA A…

200 papers

Modern biological science produces vast amounts of genomic sequence data. This is fuelling the need for efficient algorithms for sequence compression and analysis. Data compression and the associated techniques coming from information…

Data Structures and Algorithms · Computer Science 2011-09-05 Heba Afify, Muhammad Islam, Manal Abdel Wahed

While achieving a compression ratio of 2.0 bits/base, the new algorithm codes non-N bases in fixed length. It dramatically reduces coding and decoding time compared with previous DNA compression algorithms and some universal compression…

Information Theory · Computer Science 2007-07-16 Jie Liu, Sheng Bao, Zhiqiang Jing, Shi Chen

We provide an overview of current approaches to DNA-based storage system design and accompanying synthesis, sequencing and editing methods. We also introduce and analyze a suite of new constrained coding schemes for both archival and random…

Emerging Technologies · Computer Science 2015-07-08 S. M. Hossein Tabatabaei Yazdi, Han Mao Kiah, Eva Ruiz Garcia, Jian Ma, Huimin Zhao, Olgica Milenkovic

DNA synthesis is considered one of the most expensive components in current DNA storage systems. In this paper, focusing on a common synthesis machine, which generates multiple DNA strands in parallel following a fixed supersequence, we…

Information Theory · Computer Science 2025-05-13 Yajuan Liu, Tolga M. Duman

The process of DNA-based data storage (DNA storage for short) can be mathematically modelled as a communication channel, termed the DNA storage channel, whose inputs and outputs are sets of unordered sequences. To design error-correcting codes…

Information Theory · Computer Science 2020-06-11 Wentu Song, Kui Cai, Kees A. Schouhamer Immink

This study proposes a data condensation method for multivariate kernel density estimation using a genetic algorithm. First, our proposed algorithm generates multiple subsamples of a given size with replacement from the original sample. The…

Methodology · Statistics 2022-03-04 Kiheiji Nishida

We describe properties and constructions of constraint-based codes for DNA-based data storage which account for the maximum repetition length and AT/GC balance. We present algorithms for computing the number of sequences with maximum…

Information Theory · Computer Science 2018-12-18 Kees A. Schouhamer Immink, Kui Cai

A method for encoding information in DNA sequences is described. The method is based on the precision-resolution framework and is intended to work in conjunction with a recently suggested terminator-free, template-independent DNA synthesis…

Information Theory · Computer Science 2020-05-14 Siddharth Jain, Farzad Farnoud, Moshe Schwartz, Jehoshua Bruck

Accurate genome sequencing can improve our understanding of biology and the genetic basis of disease. The standard approach for generating DNA sequences from PacBio instruments relies on HMM-based models. Here, we introduce Distilled…

DNA data storage has recently attracted much attention due to its durable preservation and extremely high information density (bits per gram). In this work, we propose a hybrid coding strategy comprising generalized…

Information Theory · Computer Science 2021-12-20 Yixin Wang, Li Deng, Md. Noor-A-Rahim, Erry Gunawan, Yong L. Guan, Zhi P. Shi, Chueh L. Poh

Constrained sequential pattern mining aims at identifying frequent patterns on a sequential database of items while observing constraints defined over the item attributes. We introduce novel techniques for constraint-based sequential…

Machine Learning · Computer Science 2019-01-01 Amin Hosseininasab, Willem-Jan van Hoeve, Andre A. Cire

This paper introduces a new solution to DNA storage that integrates all three steps of retrieval, namely clustering, reconstruction, and error correction. DNA-correcting codes are presented as a unique solution to the problem of ensuring…

Information Theory · Computer Science 2024-07-02 Avital Boruchovsky, Daniella Bar-Lev, Eitan Yaakobi

Sequencing by synthesis is the underlying technology for many next-generation DNA sequencing platforms. We developed a new model, the fixed flow cycle model, to derive the distributions of sequence length for a given number of flow cycles…

Genomics · Quantitative Biology 2024-05-28 Yong Kong

A distributed computing system is a collection of processors that communicate either by reading and writing from a shared memory or by sending messages over some communication network. Most prior biologically inspired distributed computing…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-15 Sabrina Rashid, Gadi Taubenfeld, Ziv Bar-Joseph

We live in a period where bioinformatics is rapidly expanding. A significant quantity of genomic data has been produced as a result of the advancement of high-throughput genome sequencing technology, raising concerns about the costs…

Quantitative Methods · Quantitative Biology 2023-03-10 Mehedi Hasan Sarkar, Adnan Ferdous Ashrafi

The community structure of complex networks reveals both their organization and hidden relationships among their constituents. Most community detection methods currently available are not deterministic, and their results typically depend on…

Physics and Society · Physics 2012-03-29 Andrea Lancichinetti, Santo Fortunato

In this paper, we propose a novel iterative encoding algorithm for DNA storage that uses a greedy algorithm to satisfy both the GC balance and run-length constraints. DNA strands with a run-length of more than three and a GC balance ratio far from…

Information Theory · Computer Science 2023-01-04 Seong-Joon Park, Yongwoo Lee, Jong-Seon No

We consider the problem of assembling a sequence based on a collection of its substrings observed through a noisy channel. The mathematical basis of the problem is the construction and design of sequences that may be discriminated based on…

Information Theory · Computer Science 2015-11-04 Han Mao Kiah, Gregory J. Puleo, Olgica Milenkovic

DNA-based storage is an emerging technology that enables digital information to be archived in DNA molecules. This method enjoys major advantages over magnetic and optical storage solutions such as exceptional information density, enhanced…

Information Theory · Computer Science 2024-03-13 Daniella Bar-Lev, Itai Orr, Omer Sabary, Tuvi Etzion, Eitan Yaakobi

DNA sequencing technology has advanced to a point where storage is becoming the central bottleneck in the acquisition and mining of more data. Large amounts of data are vital for genomics research, and generic compression tools, while…

Information Theory · Computer Science 2016-11-15 Bobbie Chern, Idoia Ochoa, Alexandros Manolakos, Albert No, Kartik Venkat, Tsachy Weissman