Related papers: Efficient Approximation Algorithms for String Kern…

Efficient Approximate Kernel Based Spike Sequence Classification

Machine learning (ML) models, such as SVM, for tasks like classification and clustering of sequences, require a definition of distance/similarity between pairs of sequences. Several methods have been proposed to compute the similarity…

Machine Learning · Computer Science 2022-09-13 Sarwan Ali, Bikram Sahoo, Muhammad Asad Khan, Alexander Zelikovsky, Imdad Ullah Khan, Murray Patterson

Sequence Alignment Algorithm for Statistical Similarity Assessment

This paper presents a new approach to statistical similarity assessment based on sequence alignment. The algorithm performs mutual matching of two random sequences by successively searching for common elements and by applying sequence…

Signal Processing · Electrical Eng. & Systems 2021-06-09 Jakub Nikonowicz, Łukasz Matuszewski, Paweł Kubczak

Learning from String Sequences

The Universal Similarity Metric (USM) has been demonstrated to give practically useful measures of "similarity" between sequence data. Here we have used the USM as an alternative distance metric in a K-Nearest Neighbours (K-NN) learner to…

Machine Learning · Computer Science 2024-05-13 David Lindsay, Sian Lindsay

Approximation Vector Machines for Large-scale Online Learning

One of the most challenging problems in kernel online learning is to bound the model size and to promote the model sparsity. Sparse models not only improve computation and memory usage, but also enhance the generalization capacity, a…

Machine Learning · Computer Science 2017-05-30 Trung Le, Tu Dinh Nguyen, Vu Nguyen, Dinh Phung

Memory and Computation-Efficient Kernel SVM via Binary Embedding and Ternary Model Coefficients

Kernel approximation is widely used to scale up kernel SVM training and prediction. However, the memory and computation costs of kernel approximation models are still too high if we want to deploy them on memory-limited devices such as…

Machine Learning · Computer Science 2020-10-07 Zijian Lei, Liang Lan

Finite Width Model Sequence Comparison

Sequence comparison is a widely used computational technique in modern molecular biology. In spite of the frequent use of sequence comparisons the important problem of assigning statistical significance to a given degree of similarity is…

Quantitative Methods · Quantitative Biology 2007-05-23 Ralf Bundschuh, Nicholas Chia

Text Classification with Compression Algorithms

This work concerns a comparison of SVM kernel methods in text categorization tasks. In particular, I define a kernel function that estimates the similarity between two objects by computing their compressed lengths. In fact, compression…

Machine Learning · Computer Science 2012-10-30 Antonio Giuliano Zippo

Efficient Approximate Search for Sets of Vectors

We consider a similarity measure between two sets $A$ and $B$ of vectors, that balances the average and maximum cosine distance between pairs of vectors, one from set $A$ and one from set $B$. As a motivation for this measure, we present…

Data Structures and Algorithms · Computer Science 2021-08-31 Michael Leybovich, Oded Shmueli

Efficient multivariate sequence classification

Kernel-based approaches for sequence classification have been successfully applied to a variety of domains, including the text categorization, image classification, speech analysis, biological sequence analysis, time series and music…

Machine Learning · Computer Science 2014-10-01 Pavel P. Kuksa

Formal Languages and Algorithms for Similarity based Retrieval from Sequence Databases

The paper considers various formalisms based on Automata, Temporal Logic and Regular Expressions for specifying queries over sequences. Unlike traditional binary semantics, the paper presents a similarity based semantics for these…

Logic in Computer Science · Computer Science 2007-05-23 A. Prasad Sistla

A Comparative Study on String Matching Algorithm of Biological Sequences

String matching algorithms play a vital role in computational biology. The functional and structural relationship of a biological sequence is determined by similarities in that sequence. For that, the researcher is supposed to be aware…

Data Structures and Algorithms · Computer Science 2014-01-30 Pandiselvam. P, Marimuthu. T, Lawrance. R

Quantum Time Series Similarity Measures and Quantum Temporal Kernels

This article presents a quantum computing approach to designing of similarity measures and kernels for classification of stochastic symbolic time series. In the area of machine learning, kernels are important components of various…

Quantum Physics · Physics 2025-06-10 Vanio Markov, Vladimir Rastunkov, Daniel Fry

Improved Algorithms for Approximate String Matching (Extended Abstract)

The problem of approximate string matching is important in many different areas such as computational biology, text processing and pattern recognition. A great effort has been made to design efficient algorithms addressing several variants…

Data Structures and Algorithms · Computer Science 2008-07-29 Dimitris Papamichail, Georgios Papamichail

Accelerating Kernel Classifiers Through Borders Mapping

Support vector machines (SVM) and other kernel techniques represent a family of powerful statistical classification methods with high accuracy and broad applicability. Because they use all or a significant portion of the training data,…

Machine Learning · Statistics 2023-01-31 Peter Mills

Faster Algorithm of String Comparison

In many applications, it is necessary to determine string similarity. The edit distance [WF74] approach is a classic method for determining field similarity. A well-known dynamic programming algorithm [GUS97] is used to calculate edit distance…

Data Structures and Algorithms · Computer Science 2007-05-23 Qi Xiao Yang, Sung Sam Yuan, Lu Chun, Li Zhao, Sun Peng

Challenges in Binary Classification

Binary Classification plays an important role in machine learning. For linear classification, SVM is the optimal binary classification method. For nonlinear classification, the SVM algorithm needs to complete the classification task by…

Machine Learning · Computer Science 2024-06-21 Pengbo Yang, Jian Yu

Statistical performance of support vector machines

The support vector machine (SVM) algorithm is well known to the computer learning community for its very good practical results. The goal of the present paper is to study this algorithm from a statistical perspective, using tools of…

Statistics Theory · Mathematics 2008-12-18 Gilles Blanchard, Olivier Bousquet, Pascal Massart

Proposal and study of statistical features for string similarity computation and classification

Adaptations of features commonly applied in the field of visual computing, co-occurrence matrix (COM) and run-length matrix (RLM), are proposed for the similarity computation of strings in general (words, phrases, codes and texts). The…

Machine Learning · Computer Science 2026-05-15 E. O. Rodrigues, D. Casanova, M. Teixeira, V. Pegorini, F. Favarim, E. Clua, A. Conci, Panos Liatsis

Application-Driven Near-Data Processing for Similarity Search

Similarity search is a key to a variety of applications including content-based search for images and video, recommendation systems, data deduplication, natural language processing, computer vision, databases, computational biology, and…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-07-11 Vincent T. Lee, Amrita Mazumdar, Carlo C. del Mundo, Armin Alaghi, Luis Ceze, Mark Oskin

Support vector machines/relevance vector machine for remote sensing classification: A review

Kernel-based machine learning algorithms are based on mapping data from the original input feature space to a kernel feature space of higher dimensionality to solve a linear problem in that space. Over the last decade, kernel based…

Computer Vision and Pattern Recognition · Computer Science 2011-01-18 Mahesh Pal