Related papers: The Extended Edit Distance Metric

200 papers

The previous decade has brought a remarkable increase of the interest in applications that deal with querying and mining of time series data. Many of the research efforts in this context have focused on introducing new representation…

Artificial Intelligence · Computer Science 2015-03-17 Xiaoyue Wang, Hui Ding, Goce Trajcevski, Peter Scheuermann, Eamonn Keogh

A time series is a sequence of data items; typical examples are videos, stock ticker data, or streams of temperature measurements. Quite some research has been devoted to comparing and indexing simple time series, i.e., time series where…

Computational Complexity · Computer Science 2018-06-04 Jörg P. Bachmann, Johann-Christoph Freytag, Benjamin Hauskeller, Nicole Schweikardt

Time series similarity measures are highly relevant in a wide range of emerging applications including training machine learning models, classification, and predictive modeling. Standard similarity measures for time series most often…

Machine Learning · Computer Science 2021-01-22 Lucas Cassiel Jacaruso

Measuring inter-dataset similarity is an important task in machine learning and data mining with various use cases and applications. Existing methods for measuring inter-dataset similarity are computationally expensive, limited, or…

Machine Learning · Computer Science 2025-05-06 Muhammad Rajabinasab, Anton D. Lautrup, Arthur Zimek

The concepts of similarity and distance are crucial in data mining. We consider the problem of defining the distance between two data sets by comparing summary statistics computed from the data sets. The initial definition of our distance…

Data Structures and Algorithms · Computer Science 2019-02-05 Nikolaj Tatti

We survey a new area of parameter-free similarity distance measures useful in data-mining, pattern recognition, learning and automatic semantics extraction. Given a family of distances on a set of objects, a distance is universal up to a…

Information Retrieval · Computer Science 2007-05-23 Paul Vitanyi

We survey the emerging area of compression-based, parameter-free, similarity distance measures useful in data-mining, pattern recognition, learning and automatic semantics extraction. Given a family of distances on a set of objects, a…

Computer Vision and Pattern Recognition · Computer Science 2007-05-23 Rudi Cilibrasi, Paul Vitanyi

Measuring the distance between data points is fundamental to many statistical techniques, such as dimension reduction or clustering algorithms. However, improvements in data collection technologies have led to a growing versatility of…

Methodology · Statistics 2022-06-20 George Bolt, Simón Lunagómez, Christopher Nemeth

The most useful data mining primitives are distance measures. With an effective distance measure, it is possible to perform classification, clustering, anomaly detection, segmentation, etc. For single-event time series Euclidean Distance…

Machine Learning · Computer Science 2022-12-14 Audrey Der, Chin-Chia Michael Yeh, Renjie Wu, Junpeng Wang, Yan Zheng, Zhongfang Zhuang, Liang Wang, Wei Zhang, Eamonn Keogh

Metric search commonly involves finding objects similar to a given sample object. We explore a generalization, where the desired result is a fair tradeoff between multiple query objects. This builds on previous results on complex queries,…

Data Structures and Algorithms · Computer Science 2021-08-10 Magnus Lie Hetland, Halvard Hummel

Recent literature has shown that symbolic data, such as text and graphs, is often better represented by points on a curved manifold, rather than in Euclidean space. However, geometrical operations on manifolds are generally more complicated…

Machine Learning · Computer Science 2019-02-06 Max Aalto, Nakul Verma

With the continued digitization of societal processes, we are seeing an explosion in available data. This is referred to as big data. In a research setting, three aspects of the data are often viewed as the main sources of challenges when…

Databases · Computer Science 2022-05-24 Lu Chen, Yunjun Gao, Xuan Song, Zheng Li, Yifan Zhu, Xiaoye Miao, Christian S. Jensen

This work briefly explores the possibility of approximating spatial distance (alternatively, similarity) between data points using the Isolation Forest method envisioned for outlier detection. The logic is similar to that of isolation: the…

Machine Learning · Statistics 2019-11-26 David Cortes

Paraphrase plagiarism identification represents a very complex task given that plagiarized texts are intentionally modified through several rewording techniques. Accordingly, this paper introduces two new measures for evaluating the…

A geometric graph is a combinatorial graph, endowed with a geometry that is inherited from its embedding in a Euclidean space. Formulation of a meaningful measure of (dis-)similarity in both the combinatorial and geometric structures of two…

Computational Geometry · Computer Science 2022-09-27 Sushovan Majhi, Carola Wenk

The main motivation of this paper is to introduce the permutation Jensen-Shannon distance, a symbolic tool able to quantify the degree of similarity between two arbitrary time series. This quantifier results from the fusion of two concepts,…

Data Analysis, Statistics and Probability · Physics 2022-04-20 Luciano Zunino, Felipe Olivares, Haroldo V. Ribeiro, Osvaldo A. Rosso

This article provides an overview on the statistical modeling of complex data as increasingly encountered in modern data analysis. It is argued that such data can often be described as elements of a metric space that satisfies certain…

Methodology · Statistics 2024-02-28 Paromita Dubey, Yaqing Chen, Hans-Georg Müller

Measuring similarity is a basic task in information retrieval, and now often a building-block for more complex arguments about cultural change. But do measures of textual similarity and distance really correspond to evidence about cultural…

Computation and Language · Computer Science 2018-07-03 Ted Underwood

The similarity search problem is one of the main problems in time series data mining. Traditionally, this problem was tackled by sequentially comparing the given query against all the time series in the database, and returning all the time…

Databases · Computer Science 2013-01-25 Muhammad Marwan Muhammad Fuad, Pierre-François Marteau

Similarity search finds objects that are similar to a given query object based on a similarity metric. As the amount and variety of data continue to grow, similarity search in metric spaces has gained significant attention. Metric spaces…

Databases · Computer Science 2024-10-08 Yifan Zhu, Chengyang Luo, Tang Qian, Lu Chen, Yunjun Gao, Baihua Zheng