Related papers: Ultra-fast Multiple Genome Sequence Matching Using…
Traditionally, we usually utilize the method of shotgun to cut a DNA sequence into pieces and we have to reconstruct the original DNA sequence from the pieces, those are widely used method for DNA assembly. Emerging DNA sequence…
Since the release of human genome sequences, one of the most important research issues is about indexing the genome sequences, and the suffix tree is most widely adopted for that purpose. The traditional suffix tree construction algorithms…
Genetic Programming (GP) is a computationally intensive technique which also has a high degree of natural parallelism. Parallel computing architectures have become commonplace especially with regards Graphics Processing Units (GPU). Hence,…
Matrix multiplication is a foundational operation in scientific computing and machine learning, yet its computational complexity makes it a significant bottleneck for large-scale applications. The shift to parallel architectures, primarily…
Genetic Programming (GP), an evolutionary learning technique, has multiple applications in machine learning such as curve fitting, data modelling, feature selection, classification etc. GP has several inherent parallel steps, making it an…
Multiple matching algorithms are used to locate the occurrences of patterns from a finite pattern set in a large input string. Aho-Corasick and Wu-Manber, two of the most well known algorithms for multiple matching require an increased…
Analysis of DNA samples is an important step in forensics, and the speed of analysis can impact investigations. Comparison of DNA sequences is based on the analysis of short tandem repeats (STRs), which are short DNA sequences of 2-5 base…
Genetic Programming (GP) is a computationally intensive technique which is naturally parallel in nature. Consequently, many attempts have been made to improve its run-time from exploiting highly parallel hardware such as GPUs. However, a…
I present a new GPU implementation of the wavelet tree data structure. It includes binary rank and select support structures that provide at least 10 times higher throughput of binary rank and select queries than the best publicly available…
Genetic information is increasing exponentially, doubling every 18 months. Analyzing this information within a reasonable amount of time requires parallel computing resources. While considerable research has addressed DNA analysis using…
Computational Pangenomics is an emerging field that studies genetic variation using a graph structure encompassing multiple genomes. Visualizing pangenome graphs is vital for understanding genome diversity. Yet, handling large graphs can be…
The suffix tree is a data structure for indexing strings. It is used in a variety of applications such as bioinformatics, time series analysis, clustering, text editing and data compression. However, when the string and the resulting suffix…
Similarity search finds application in specialized database systems handling complex data such as images or videos, which are typically represented by high-dimensional features and require specific indexing structures. This paper tackles…
Genetic Algorithms (GAs) are used to solve search and optimization problems in which an optimal solution can be found using an iterative process with probabilistic and non-deterministic transitions. However, depending on the problem's…
Graph coloring has been broadly used to discover concurrency in parallel computing. To speedup graph coloring for large-scale datasets, parallel algorithms have been proposed to leverage modern GPUs. Existing GPU implementations either have…
Motif discovery in DNA sequences is a challenging task in molecular biology. In computational motif discovery, Planted (l, d) motif finding is a widely studied problem and numerous algorithms are available to solve it. Both hardware and…
Huge amount of data in the form of strings are being handled in bio-computing applications and searching algorithms are quite frequently used in them. Many methods utilizing on both software and hardware are being proposed to accelerate…
The simplex algorithm has been successfully used for many years in solving linear programming (LP) problems. Due to the intensive computations required (especially for the solution of large LP problems), parallel approaches have also…
The search for similar genetic sequences is one of the main bioinformatics tasks. The genetic sequences data banks are growing exponentially and the searching techniques that use linear time are not capable to do the search in the required…
Similarity search, the task of identifying objects most similar to a given query object under a specific metric, has gathered significant attention due to its practical applications. However, the absence of coordinate information to…