Related papers: Fast low-level pattern matching algorithm

DNA Pattern Matching Acceleration with Analog Resistive CAM

DNA pattern matching is essential for many widely used bioinformatics applications. Disease diagnosis is one of these applications, since analyzing changes in DNA sequences can increase our understanding of possible genetic diseases. The…

Hardware Architecture · Computer Science 2022-06-01 Jinane Bazzi, Jana Sweidan, Mohammed E. Fouda, Rouwaida Kanj, Ahmed M. Eltawil

New Sequence Alignment Algorithm using AI Rules and Dynamic Seeds

DNA sequence alignment is important today as it is usually the first step in finding gene mutation, evolutionary similarities, protein structure, drug development and cancer treatment. Covid-19 is one recent example. There are many…

Genomics · Quantitative Biology 2023-06-01 Suchindra, Preetam Nagaraj

A Compact Index for Order-Preserving Pattern Matching

Order-preserving pattern matching was introduced recently but it has already attracted much attention. Given a reference sequence and a pattern, we want to locate all substrings of the reference sequence whose elements have the same…

Data Structures and Algorithms · Computer Science 2018-12-11 Gianni Decaroli, Travis Gagie, Giovanni Manzini

Motif matching using gapped patterns

We present new algorithms for the problem of multiple string matching of gapped patterns, where a gapped pattern is a sequence of strings such that there is a gap of fixed length between each two consecutive strings. The problem has…

Data Structures and Algorithms · Computer Science 2014-07-08 Emanuele Giaquinta, Kimmo Fredriksson, Szymon Grabowski, Alexandru I. Tomescu, Esko Ukkonen

Linear Approximate Pattern Matching Algorithm

Pattern matching is a fundamental process in almost every scientific domain. The problem involves finding the positions of a given pattern (usually of short length) in a reference stream of data (usually of large length). The matching can…

Data Structures and Algorithms · Computer Science 2022-07-01 Anas Al-okaily, Abdelghani Tbakhi

Coding for Optimized Writing Rate in DNA Storage

A method for encoding information in DNA sequences is described. The method is based on the precision-resolution framework, and is aimed to work in conjunction with a recently suggested terminator-free template independent DNA synthesis…

Information Theory · Computer Science 2020-05-14 Siddharth Jain, Farzad Farnoud, Moshe Schwartz, Jehoshua Bruck

A Fast Generic Sequence Matching Algorithm

A string matching -- and more generally, sequence matching -- algorithm is presented that has a linear worst-case computing time bound, a low worst-case bound on the number of comparisons (2n), and sublinear average-case behavior that is…

Data Structures and Algorithms · Computer Science 2008-10-02 David R. Musser, Gor V. Nishanov

Fast differentiable DNA and protein sequence optimization for molecular design

Designing DNA and protein sequences with improved function has the potential to greatly accelerate synthetic biology. Machine learning models that accurately predict biological fitness from sequence are becoming a powerful tool for…

Machine Learning · Computer Science 2022-03-17 Johannes Linder, Georg Seelig

Eliminating unwanted patterns with minimal interference

Artificial synthesis of DNA molecules is an essential part of the study of biological mechanisms. The design of a synthetic DNA molecule usually involves many objectives. One of the important objectives is to eliminate short sequence…

Biomolecules · Quantitative Biology 2021-08-13 Zehavit Leibovich, Ilan Gronau

A family of fast exact pattern matching algorithms

A family of comparison-based exact pattern matching algorithms is described. They utilize multi-dimensional arrays in order to process more than one adjacent text window in each iteration of the search cycle. This approach leads to a lower…

Data Structures and Algorithms · Computer Science 2016-08-31 Igor O. Zavadskyi

An Efficient Genetic Algorithm for Discovering Diverse-Frequent Patterns

Working with exhaustive search on large dataset is infeasible for several reasons. Recently, developed techniques that made pattern set mining feasible by a general solver with long execution time that supports heuristic search and are…

Artificial Intelligence · Computer Science 2015-07-21 Shanjida Khatun, Hasib Ul Alam, Swakkhar Shatabda

Structure Learning of Deep Networks via DNA Computing Algorithm

Convolutional Neural Network (CNN) has gained state-of-the-art results in many pattern recognition and computer vision tasks. However, most of the CNN structures are manually designed by experienced researchers. Therefore, automatically…

Neural and Evolutionary Computing · Computer Science 2018-10-26 Guoqiang Zhong, Tao Li, Wenxue Liu, Yang Chen

DNA Sequence Alignment by Window based Optical Correlator

In genomics, pattern matching against a sequence of nucleotides plays a pivotal role for DNA sequence alignment and comparing genomes. This helps tackling some diseases, such as cancer in humans. The complexity of searching biological…

Quantitative Methods · Quantitative Biology 2017-10-04 Fereshte Mozafari, Hossein Babashah, Somayyeh Koohi, Zahra Kavehvash

Iterative Learning for Reference-Guided DNA Sequence Assembly from Short Reads: Algorithms and Limits of Performance

Recent emergence of next-generation DNA sequencing technology has enabled acquisition of genetic information at unprecedented scales. In order to determine the genetic blueprint of an organism, sequencing platforms typically employ…

Genomics · Quantitative Biology 2015-06-19 Xiaohu Shen, Manohar Shamaiah, Haris Vikalo

In Search of Lost DNA Sequence Pretraining

DNA sequence encoding is fundamental to gene function prediction, protein synthesis, and diverse downstream biological tasks. Despite the substantial progress achieved by large-scale DNA sequence pretraining, existing studies have…

Machine Learning · Computer Science 2026-04-21 Zhijiang Tang, Jiaxin Qi, Yan Cui, Jinli Ou, Yuhua Zheng, Jianqiang Huang

Training for Fast Sequential Prediction Using Dynamic Feature Selection

We present paired learning and inference algorithms for significantly reducing computation and increasing speed of the vector dot products in the classifiers that are at the heart of many NLP components. This is accomplished by partitioning…

Computation and Language · Computer Science 2014-12-23 Emma Strubell, Luke Vilnis, Andrew McCallum

A Subsequence Interleaving Model for Sequential Pattern Mining

Recent sequential pattern mining methods have used the minimum description length (MDL) principle to define an encoding scheme which describes an algorithm for mining the most compressing patterns in a database. We present a novel…

Machine Learning · Statistics 2016-11-14 Jaroslav Fowkes, Charles Sutton

A Fixed-Length Coding Algorithm for DNA Sequence Compression

While achieving a compression ratio of 2.0 bits/base, the new algorithm codes non-N bases in fixed length. It dramatically reduces the time of coding and decoding than previous DNA compression algorithms and some universal compression…

Information Theory · Computer Science 2007-07-16 Jie Liu, Sheng Bao, Zhiqiang Jing, Shi Chen

A Genetic Algorithm for Obtaining Memory Constrained Near-Perfect Hashing

The problem of fast items retrieval from a fixed collection is often encountered in most computer science areas, from operating system components to databases and user interfaces. We present an approach based on hash tables that focuses on…

Neural and Evolutionary Computing · Computer Science 2020-07-17 Dan Domnita, Ciprian Oprisa

Learning Dynamic Feature Selection for Fast Sequential Prediction

We present paired learning and inference algorithms for significantly reducing computation and increasing speed of the vector dot products in the classifiers that are at the heart of many NLP components. This is accomplished by partitioning…

Computation and Language · Computer Science 2015-05-25 Emma Strubell, Luke Vilnis, Kate Silverstein, Andrew McCallum