Related papers: Improved Lower Bounds for Constant GC-Content DNA …
We derive theoretical upper and lower bounds on the maximum size of DNA codes of length n with constant GC-content w and minimum Hamming distance d, both with and without the additional constraint that the minimum Hamming distance between…
DNA storage has emerged as an important area of research. The reliability of a DNA storage system depends on designing DNA strings (called DNA codes) that are sufficiently dissimilar. In this work, we introduce DNA codes that satisfy a…
DNA data storage has recently attracted much attention due to its durable preservation and extremely high information density (bits per gram). In this work, we propose a hybrid coding strategy comprising generalized…
We propose coding techniques that limit the length of homopolymer runs, satisfy the GC-content constraint, and are capable of correcting a single edit error in strands of nucleotides in DNA-based data storage systems. In particular, for…
As DNA data storage moves closer to practical deployment, minimizing sequencing coverage depth is essential to reduce both operational costs and retrieval latency. This paper addresses the recently studied Random Access Problem, which…
Linear error-correcting codes form the mathematical backbone of modern digital communication and storage systems, but identifying champion linear codes (linear codes achieving or exceeding the best known minimum Hamming distance) remains…
Composite DNA is a recent method to increase the information capacity of DNA-based data storage above the theoretical limit of 2 bits/symbol. In this method, each composite symbol stores not a single DNA nucleotide but a mixture…
The Local Search problem, which asks for a local minimum of a black-box function on a given graph, is of both practical and theoretical importance to combinatorial optimization, complexity theory and many other areas in theoretical computer…
We describe properties and constructions of constraint-based codes for DNA-based data storage which account for the maximum repetition length and AT/GC balance. We present algorithms for computing the number of sequences with maximum…
Constrained clustering leverages limited domain knowledge to improve clustering performance and interpretability, but incorporating pairwise must-link and cannot-link constraints is an NP-hard challenge, making global optimization…
In DNA-based data storage, DNA codes with biochemical constraints and error correction are designed to protect data reliability. Single-stranded DNA sequences with secondary structure avoidance (SSA) help to avoid undesirable secondary…
DNA strings and their properties have been widely studied over the last 20 years due to their applications in DNA computing. In this area, one designs a set of DNA strings (called a DNA code) that satisfies certain thermodynamic and combinatorial…
DNA codes have garnered significant interest due to their utilization in digital media storage, cryptography, and DNA computing. In this paper, we first extend the results of constructing reversible group codes \cite{Cengellenmis} and…
We consider the problem of efficiently designing sets (codes) of equal-length DNA strings (words) that satisfy certain combinatorial constraints. This problem has numerous motivations including DNA computing and DNA self-assembly. Previous…
Storing digital data in synthetic DNA faces challenges in ensuring data reliability in the presence of edit errors (deletions, insertions, and substitutions) that occur randomly during various stages of the storage process. Current…
We study the amount of reliable information that can be stored in a DNA-based storage system composed of short DNA molecules. In this regime, Shomorony and Heckel (2022) put forward a conjecture on the scaling of the number of information…
Hopping cyclic codes (HCCs) are (non-linear) cyclic codes with the additional property that the $n$ cyclic shifts of every given codeword are all distinct, where $n$ is the code length. Constant weight binary hopping cyclic codes are also…
Regenerating codes allow distributed storage systems to recover from the loss of a storage node while transmitting the minimum possible amount of data across the network. We present a systematic computer search for optimal systematic…
In this paper we study error-correcting codes for the storage of data in synthetic deoxyribonucleic acid (DNA). We investigate a storage model where a data set is represented by an unordered set of $M$ sequences, each of length $L$. Errors…
The problem of fast item retrieval from a fixed collection arises in most areas of computer science, from operating system components to databases and user interfaces. We present an approach based on hash tables that focuses on…