Related papers: Static Virus Spread Algorithm for DNA Sequence Des…
DNA code design aims to generate a set of DNA sequences (codewords) with minimum likelihood of undesired hybridizations among sequences and their reverse-complement (RC) pairs (cross-hybridization). Inspired by the distinct hybridization…
We provide an overview of current approaches to DNA-based storage system design and accompanying synthesis, sequencing and editing methods. We also introduce and analyze a suite of new constrained coding schemes for both archival and random…
The task of RNA design given a target structure aims to find a sequence that can fold into that structure. It is a computationally hard problem where some version(s) have been proven to be NP-hard. As a result, heuristic methods such as…
DNA-based storage is an emerging technology that enables digital information to be archived in DNA molecules. This method enjoys major advantages over magnetic and optical storage solutions such as exceptional information density, enhanced…
DNA storage is now being considered as a new archival storage method for its durability and high information density, but still facing some challenges like high costs and low throughput. By reducing sequencing sample size for decoding…
Recent advances in next-generation sequencing technologies have facilitated the use of deoxyribonucleic acid (DNA) as a novel covert channels in steganography. There are various methods that exist in other domains to detect hidden messages…
Algorithms for machine learning-guided design, or design algorithms, use machine learning-based predictions to propose novel objects with desired property values. Given a new design task -- for example, to design novel proteins with high…
The paper describes an algorithm to compute a consensus sequence from a set of DNA sequences of approximatively identical length generated by 3rd sequencing generation technologies. Its purpose targets DNA storage and is guided by specific…
The process of DNA-based data storage (DNA storage for short) can be mathematically modelled as a communication channel, termed DNA storage channel, whose inputs and outputs are sets of unordered sequences. To design error correcting codes…
We consider the problem of efficiently designing sets (codes) of equal-length DNA strings (words) that satisfy certain combinatorial constraints. This problem has numerous motivations including DNA computing and DNA self-assembly. Previous…
DNA-based storage offers unprecedented density and durability, but its scalability is fundamentally limited by the efficiency of parallel strand synthesis. Existing methods either allow unconstrained nucleotide additions to individual…
The advent of deep learning has introduced efficient approaches for de novo protein sequence design, significantly improving success rates and reducing development costs compared to computational or experimental methods. However, existing…
DNA sequence alignment is important today as it is usually the first step in finding gene mutation, evolutionary similarities, protein structure, drug development and cancer treatment. Covid-19 is one recent example. There are many…
Protein sequence design is a challenging problem in protein engineering, which aims to discover novel proteins with useful biological functions. Directed evolution is a widely-used approach for protein sequence design, which mimics the…
DNA sequencing is revolutionising the field of medicine. DNA sequencers, the machines which perform DNA sequencing, have evolved from the size of a fridge to that of a mobile phone over the last two decades. The cost of sequencing a human…
According to the no-free-lunch theorem, there is no single meta-heuristic algorithm that can optimally solve all optimization problems. This motivates many researchers to continuously develop new optimization algorithms. In this paper, a…
DNA has the potential to realize a controllable liquid-liquid phase separation (LLPS) system, because the design of its base sequences results in programmable interactions. Here, we have developed a novel DNA-based LLPS system which enables…
RNA design aims to identify RNA sequences that fold into a target secondary structure. This task is challenging in terms of computational efficiency. Most existing methods focus on either minimum free energy (MFE)-based or ensemble-based…
Meta-heuristic algorithms are widely used to tackle complex optimization problems, including nonlinear, multimodal, and high-dimensional tasks. However, many existing methods suffer from premature convergence, limited exploration, and…
Recent studies reveal even the smallest genomes such as viruses evolve through complex and stochastic processes, and the assumption of independent alleles is not valid in most applications. Advances in sequencing technologies produce…