Related papers: Newer method of string comparison: the Modified Mo…

200 papers

Contrast pattern mining (CPM) aims to discover patterns whose support increases significantly from a background dataset to a target dataset. CPM is particularly useful for characterising changes in evolving systems, e.g., in…

Networking and Internet Architecture · Computer Science 2020-12-01 Elaheh AlipourChavary , Sarah M. Erfani , Christopher Leckie

In many applications, it is necessary to determine string similarity. The edit distance [WF74] approach is a classic method to determine field similarity. A well-known dynamic programming algorithm [GUS97] is used to calculate edit distance…

Data Structures and Algorithms · Computer Science 2007-05-23 Qi Xiao Yang , Sung Sam Yuan , Lu Chun , Li Zhao , Sun Peng
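
For readers unfamiliar with the dynamic programming approach the abstract above refers to, here is a minimal sketch of the classic edit distance computation. It assumes unit costs for insertion, deletion, and substitution. The function name `edit_distance` is illustrative, and this is the textbook baseline, not the modified method the listed paper proposes.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic edit distance with unit costs (an assumption of this sketch).

    Keeps only two rows of the dynamic programming table, so memory use
    is proportional to len(b) rather than len(a) * len(b).
    """
    # prev[j] is the distance between the empty prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # curr[0] is the distance between a[:i] and the empty string.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1       # drop ca from a
            insertion = curr[j - 1] + 1  # insert cb into a
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

The running time is proportional to the product of the two string lengths, which is the cost that faster or approximate similarity measures typically try to avoid.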

Contrast pattern mining (CPM) is an important and popular subfield of data mining. Traditional sequential patterns cannot describe the contrast information between different classes of data, while contrast patterns involving the concept of…

Databases · Computer Science 2022-09-28 Yao Chen , Wensheng Gan , Yongdong Wu , Philip S. Yu

In domain adaptation, covariate shift and label shift problems are two distinct and complementary tasks. In covariate shift adaptation where the differences in data distribution arise from variations in feature probabilities, existing…

Machine Learning · Statistics 2023-12-13 Hongwei Wen , Annika Betken , Hanyuan Hang

The Burrows-Wheeler transform (BWT) is a well studied text transformation widely used in data compression and text indexing. The BWT of two strings can also provide similarity measures between them, based on the observation that the more…

Data Structures and Algorithms · Computer Science 2020-09-10 Felipe A. Louza , Guilherme P. Telles , Simon Gog , Liang Zhao

We present an online algorithm to deal with pattern matching in strings. The problem we investigate is commonly known as string matching with mismatches in which the objective is to report the number of characters that match when a pattern…

Data Structures and Algorithms · Computer Science 2016-03-11 Vinodprasad P

In the realm of patent document analysis, assessing semantic similarity between phrases presents a significant challenge, notably amplifying the inherent complexities of Cooperative Patent Classification (CPC) research. Firstly, this study…

Computation and Language · Computer Science 2024-01-17 Liqiang Yu , Bo Liu , Qunwei Lin , Xinyu Zhao , Chang Che

Today, with the emergence of semantic web technologies and the growing quantity of information, searching for information based on the semantic web has become a fertile area of research. For this reason, a large number of studies are…

Computer Vision and Pattern Recognition · Computer Science 2021-10-05 Noreddine Gherabi , Abdelhadi Daoui , Abderrahim Marzouk

Computing string or sequence alignments is a classical method of comparing strings and has applications in many areas of computing, such as signal processing and bioinformatics. Semi-local string alignment is a recent generalisation of this…

Data Structures and Algorithms · Computer Science 2009-03-23 Peter Krusche , Alexander Tiskin

We present a supervised learning algorithm for text categorization which earned the team of authors 2nd place in the text categorization division of the 2012 Cybersecurity Data Mining Competition (CDMC'2012) and a 3rd prize…

Information Retrieval · Computer Science 2013-07-11 Hubert Haoyang Duan , Vladimir Pestov , Varun Singla

This paper addresses an important problem in Example-Based Machine Translation (EBMT), namely how to measure similarity between a sentence fragment and a set of stored examples. A new method is proposed that measures similarity according to…

cmp-lg · Computer Science 2008-02-03 Lambros Cranias , Harris Papageorgiou , Stelios Piperidis

We present a novel feature matching algorithm that systematically utilizes the geometric properties of features such as position, scale, and orientation, in addition to the conventional descriptor vectors. In challenging scenes with the…

Computer Vision and Pattern Recognition · Computer Science 2017-01-23 Sehyung Lee , Jongwoo Lim , Il Hong Suh

In 2020, Peter Larsen reported flaws in the methods for centrosymmetry parameter computation in the existing molecular dynamics and analysis packages. He proposed an intuitive and mathematically rigorous formulation for centrosymmetry…

Computational Physics · Physics 2026-05-22 Vasily V. Pisarev

In this paper we present $LCSk$++: a new metric for measuring the similarity of long strings, and provide an algorithm for its efficient computation. With ever-increasing size of strings occurring in practice, e.g. large genomes of plants…

Data Structures and Algorithms · Computer Science 2019-08-27 Filip Pavetić , Goran Žužić , Mile Šikić

String matching is a fundamental problem in computer science, with critical applications in text retrieval, bioinformatics, and data analysis. Among the numerous solutions that have emerged for this problem in recent decades,…

Data Structures and Algorithms · Computer Science 2025-03-10 Simone Faro , Arianna Pavone , Caterina Viola

We investigate the performance on phoneme categorization and phoneme and word segmentation of several self-supervised learning (SSL) methods based on Contrastive Predictive Coding (CPC). Our experiments show that with the existing…

Poetry is one of the most important art forms of human languages. Recently many studies have focused on incorporating some linguistic features of poetry, such as style and sentiment, into its understanding or generation system. However,…

Computation and Language · Computer Science 2021-06-04 Wenhao Li , Fanchao Qi , Maosong Sun , Xiaoyuan Yi , Jiarui Zhang

We put forth a new string matching algorithm which matches the pattern from neither the left nor the right end, but instead from a special position. Compared with the Knuth-Morris-Pratt algorithm and the Boyer-Moore algorithm, the new algorithm is…

Data Structures and Algorithms · Computer Science 2014-01-29 Zhengjun Cao , Lihua Liu

Dot plots are a standard method for local comparison of biological sequences. In a dot plot, a substring to substring distance is computed for all pairs of fixed-size windows in the input strings. Commonly, the Hamming distance is used…

Data Structures and Algorithms · Computer Science 2009-09-11 Peter Krusche , Alexander Tiskin
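
The dot plot described in the abstract above can be illustrated with a naive sketch: for every pair of fixed-size windows, one from each string, compute the Hamming distance. The function name `hamming_dot_plot` and the parameter `w` (window size) are illustrative assumptions. This is the direct quadratic-times-window computation, not the algorithm of the listed paper.

```python
def hamming_dot_plot(a: str, b: str, w: int) -> list[list[int]]:
    """Return a matrix whose entry [i][j] is the Hamming distance
    between the windows a[i:i+w] and b[j:j+w].

    Naive baseline: every window pair is compared character by character.
    """
    rows = len(a) - w + 1
    cols = len(b) - w + 1
    # No windows exist if w is non-positive or longer than either string.
    if w <= 0 or rows <= 0 or cols <= 0:
        return []
    return [
        [
            sum(x != y for x, y in zip(a[i:i + w], b[j:j + w]))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


if __name__ == "__main__":
    for row in hamming_dot_plot("abcd", "abed", 2):
        print(row)
```

Low values along a diagonal of the matrix indicate regions where the two sequences locally agree, which is what a dot plot is meant to reveal.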

Adaptations of features commonly applied in the field of visual computing, co-occurrence matrix (COM) and run-length matrix (RLM), are proposed for the similarity computation of strings in general (words, phrases, codes and texts). The…

Machine Learning · Computer Science 2026-05-15 E. O. Rodrigues , D. Casanova , M. Teixeira , V. Pegorini , F. Favarim , E. Clua , A. Conci , Panos Liatsis