Related papers: Learning to Link

Data-Driven Clustering via Parameterized Lloyd's Families

Algorithms for clustering points in metric spaces is a long-studied area of research. Clustering has seen a multitude of work both theoretically, in understanding the approximation guarantees possible for many objective functions such as…

Data Structures and Algorithms · Computer Science 2019-05-27 Maria-Florina Balcan , Travis Dick , Colin White

Constrained Clustering and Multiple Kernel Learning without Pairwise Constraint Relaxation

Clustering under pairwise constraints is an important knowledge discovery tool that enables the learning of appropriate kernels or distance metrics to improve clustering performance. These pairwise constraints, which come in the form of…

Machine Learning · Computer Science 2022-03-24 Benedikt Boecking , Vincent Jeanselme , Artur Dubrawski

A Short Survey on Data Clustering Algorithms

With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research. They have been successfully applied to a wide range of domains; for instance, bioinformatics, speech recognition, and financial…

Data Structures and Algorithms · Computer Science 2015-12-01 Ka-Chun Wong

Exploring dual information in distance metric learning for clustering

Distance metric learning algorithms aim to appropriately measure similarities and distances between data points. In the context of clustering, metric learning is typically applied with the assist of side-information provided by experts,…

Machine Learning · Computer Science 2021-05-27 Rodrigo Randel , Daniel Aloise , Alain Hertz

Categorical Data Clustering via Value Order Estimated Distance Metric Learning

Clustering is a popular machine learning technique for data mining that can process and analyze datasets to automatically reveal sample distribution patterns. Since the ubiquitous categorical data naturally lack a well-defined metric space…

Machine Learning · Computer Science 2025-09-01 Yiqun Zhang , Mingjie Zhao , Hong Jia , Yang Lu , Mengke Li , Yiu-ming Cheung

Clustering For Point Pattern Data

Clustering is one of the most common unsupervised learning tasks in machine learning and data mining. Clustering algorithms have been used in a plethora of applications across several scientific fields. However, there has been limited…

Machine Learning · Computer Science 2017-02-09 Quang N. Tran , Ba-Ngu Vo , Dinh Phung , Ba-Tuong Vo

A Rapid Review of Clustering Algorithms

Clustering algorithms aim to organize data into groups or clusters based on the inherent patterns and similarities within the data. They play an important role in today's life, such as in marketing and e-commerce, healthcare, data…

Machine Learning · Computer Science 2024-01-17 Hui Yin , Amir Aryani , Stephen Petrie , Aishwarya Nambissan , Aland Astudillo , Shengyuan Cao

An Overview on Clustering Methods

Clustering is a common technique for statistical data analysis, which is used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics. Clustering is the process of grouping similar…

Data Structures and Algorithms · Computer Science 2012-05-08 T. Soni Madhulatha

Issues,Challenges and Tools of Clustering Algorithms

Clustering is an unsupervised technique of Data Mining. It means grouping similar objects together and separating the dissimilar ones. Each object in the data set is assigned a class label in the clustering process using a distance measure.…

Information Retrieval · Computer Science 2011-10-13 Parul Agarwal , M. Afshar Alam , Ranjit Biswas

A Unifying Family of Data-Adaptive Partitioning Algorithms

Clustering algorithms remain valuable tools for grouping and summarizing the most important aspects of data. Example areas where this is the case include image segmentation, dimension reduction, signals analysis, model order reduction,…

Numerical Analysis · Mathematics 2024-12-24 Guy B. Oldaker , Maria Emelianenko

The Exploitation of Distance Distributions for Clustering

Although distance measures are used in many machine learning algorithms, the literature on the context-independent selection and evaluation of distance measures is limited in the sense that prior knowledge is used. In cluster analysis,…

Machine Learning · Computer Science 2021-08-24 Michael C. Thrun

Fairness, Semi-Supervised Learning, and More: A General Framework for Clustering with Stochastic Pairwise Constraints

Metric clustering is fundamental in areas ranging from Combinatorial Optimization and Data Mining, to Machine Learning and Operations Research. However, in a variety of situations we may have additional requirements or knowledge, distinct…

Machine Learning · Computer Science 2021-03-04 Brian Brubach , Darshan Chakrabarti , John P. Dickerson , Aravind Srinivasan , Leonidas Tsepenekas

Seeking the Truth Beyond the Data. An Unsupervised Machine Learning Approach

Clustering is an unsupervised machine learning methodology where unlabeled elements/objects are grouped together aiming to the construction of well-established clusters that their elements are classified according to their similarity. The…

Machine Learning · Statistics 2023-10-20 Dimitrios Saligkaras , Vasileios E. Papageorgiou

Clustering validity based on the most similarity

One basic requirement of many studies is the necessity of classifying data. Clustering is a proposed method for summarizing networks. Clustering methods can be divided into two categories named model-based approaches and algorithmic…

Machine Learning · Computer Science 2013-02-19 Raheleh Namayandeh , Farzad Didehvar , Zahra Shojaei

Clustering Plotted Data by Image Segmentation

Clustering algorithms are one of the main analytical methods to detect patterns in unlabeled data. Existing clustering methods typically treat samples in a dataset as points in a metric space and compute distances to group together similar…

Machine Learning · Computer Science 2021-10-12 Tarek Naous , Srinjay Sarkar , Abubakar Abid , James Zou

Crowdsourced correlation clustering with relative distance comparisons

Crowdsourced, or human computation based clustering algorithms usually rely on relative distance comparisons, as these are easier to elicit from human workers than absolute distance information. A relative distance comparison is a statement…

Data Structures and Algorithms · Computer Science 2017-09-26 Antti Ukkonen

Semi-Supervised Constrained Clustering: An In-Depth Overview, Ranked Taxonomy and Future Research Directions

Clustering is a well-known unsupervised machine learning approach capable of automatically grouping discrete sets of instances with similar characteristics. Constrained clustering is a semi-supervised extension to this process that can be…

Machine Learning · Computer Science 2023-03-02 Germán González-Almagro , Daniel Peralta , Eli De Poorter , José-Ramón Cano , Salvador García

Meta-Learning to Cluster

Clustering is one of the most fundamental and wide-spread techniques in exploratory data analysis. Yet, the basic approach to clustering has not really changed: a practitioner hand-picks a task-specific clustering loss to optimize and fit…

Machine Learning · Computer Science 2019-11-01 Yibo Jiang , Nakul Verma

A Tutorial on Distance Metric Learning: Mathematical Foundations, Algorithms, Experimental Analysis, Prospects and Challenges (with Appendices on Mathematical Background and Detailed Algorithms Explanation)

Distance metric learning is a branch of machine learning that aims to learn distances from the data, which enhances the performance of similarity-based algorithms. This tutorial provides a theoretical background and foundations on this…

Machine Learning · Computer Science 2020-08-20 Juan Luis Suárez-Díaz , Salvador García , Francisco Herrera

A shortest-path based clustering algorithm for joint human-machine analysis of complex datasets

Clustering is a technique for the analysis of datasets obtained by empirical studies in several disciplines with a major application for biomedical research. Essentially, clustering algorithms are executed by machines aiming at finding…

Quantitative Methods · Quantitative Biology 2024-09-30 Diego Ulisse Pizzagalli , Santiago Fernandez Gonzalez , Rolf Krause