Related papers: A framework for benchmarking clustering algorithms

200 papers

Note: A revised version of this is now published. Please cite and read (it's open access): Van Mechelen, I., Boulesteix, A.-L., Dangl, R., Dean, N., Hennig, C., Leisch, F., Steinley, D., Warrens, M. J. (2023). A white paper on good research…

Comprehensive benchmarking of clustering algorithms is rendered difficult by two key factors: (i) the elusiveness of a unique mathematical definition of this unsupervised learning approach and (ii) dependencies between the generating models…

Neural and Evolutionary Computing · Computer Science 2022-01-11 Cameron Shand, Richard Allmendinger, Julia Handl, Andrew Webb, John Keane

Experimental evaluation is a major research methodology for investigating clustering algorithms and many other machine learning algorithms. For this purpose, a number of benchmark datasets have been widely used in the literature and their…

Machine Learning · Computer Science 2019-10-21 Tiantian Zhang, Li Zhong, Bo Yuan

Performance of clustering algorithms is evaluated with the help of accuracy metrics. There is a great diversity of clustering algorithms, which are key components of many data analysis and exploration systems. However, there exist only few…

Data Structures and Algorithms · Computer Science 2019-02-18 Artem Lutov, Mourad Khayati, Philippe Cudré-Mauroux

With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research. They have been successfully applied to a wide range of domains; for instance, bioinformatics, speech recognition, and financial…

Data Structures and Algorithms · Computer Science 2015-12-01 Ka-Chun Wong

Quality assessments of models in unsupervised learning and clustering verification in particular have been a long-standing problem in machine learning research. The lack of robust and universally applicable cluster validity scores often…

Machine Learning · Statistics 2018-03-30 Luzie Helfmann, Johannes von Lindheim, Mattes Mollenhauer, Ralf Banisch

Clustering is an unsupervised technique of Data Mining. It means grouping similar objects together and separating the dissimilar ones. Each object in the data set is assigned a class label in the clustering process using a distance measure.…

Information Retrieval · Computer Science 2011-10-13 Parul Agarwal, M. Afshar Alam, Ranjit Biswas

Clustering evaluation measures are frequently used to evaluate the performance of algorithms. However, most measures are not properly normalized and ignore some information in the inherent structure of clusterings. We model the relation…

Machine Learning · Computer Science 2012-09-05 Qiaoliang Xiang, Qi Mao, Kian Ming Chai, Hai Leong Chieu, Ivor Tsang, Zhendong Zhao

Cluster analysis refers to a wide range of data analytic techniques for class discovery and is popular in many application fields. To judge the quality of a clustering result, different cluster validation procedures have been proposed in…

Methodology · Statistics 2022-01-11 Theresa Ullmann, Christian Hennig, Anne-Laure Boulesteix

Clustering algorithms aim to organize data into groups or clusters based on the inherent patterns and similarities within the data. They play an important role in today's life, such as in marketing and e-commerce, healthcare, data…

Machine Learning · Computer Science 2024-01-17 Hui Yin, Amir Aryani, Stephen Petrie, Aishwarya Nambissan, Aland Astudillo, Shengyuan Cao

The selection, development, or comparison of machine learning methods in data mining can be a difficult task based on the target problem and goals of a particular study. Numerous publicly available real-world and simulated benchmark…

Machine Learning · Computer Science 2017-03-03 Randal S. Olson, William La Cava, Patryk Orzechowski, Ryan J. Urbanowicz, Jason H. Moore

Many cluster similarity indices are used to evaluate clustering algorithms, and choosing the best one for a particular task remains an open problem. We demonstrate that this problem is crucial: there are many disagreements among the…

Discrete Mathematics · Computer Science 2021-08-27 Martijn Gösgens, Alexey Tikhonov, Liudmila Prokhorenkova

Cluster analysis is one of the essential tasks in data mining and knowledge discovery. Each type of data poses unique challenges in achieving relatively efficient partitioning of the data into homogeneous groups. While the algorithms for…

Machine Learning · Computer Science 2018-12-11 Ruben A. Gevorgyan, Yenok B. Hakobyan

We review clustering as an analysis tool and the underlying concepts from an introductory perspective. What is clustering and how can clusterings be realised programmatically? How can data be represented and prepared for a clustering task?…

Machine Learning · Computer Science 2022-12-05 Jan-Oliver Felix Kapp-Joswig, Bettina G. Keller

There is a great diversity of clustering and community detection algorithms, which are key components of many data analysis and exploration systems. To the best of our knowledge, however, there does not yet exist any uniform benchmarking…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-02-04 Artem Lutov, Mourad Khayati, Philippe Cudré-Mauroux

A data analysis pipeline is a structured sequence of steps that transforms raw data into meaningful insights by integrating multiple analysis algorithms. In many practical applications, analytical findings are obtained only after data pass…

Machine Learning · Statistics 2026-05-04 Yugo Miyata, Tomohiro Shiraishi, Shuichi Nishino, Ichiro Takeuchi

Graph clustering is widely used in the analysis of biological networks, social networks, etc. For over a decade many graph clustering algorithms have been published; however, a comprehensive and consistent performance comparison is not…

Social and Information Networks · Computer Science 2020-05-12 Lizhen Shi, Bo Chen

Recommender systems are one of the most applied methods in machine learning and find applications in many areas, ranging from economics to the Internet of things. This article provides a general overview of modern approaches to recommender…

Information Retrieval · Computer Science 2021-09-28 Irina Beregovskaya, Mikhail Koroteev

We consider clustering in group decision making where the opinions are given by pairwise comparison matrices. In particular, the k-medoids model is suggested to classify the matrices since it has a linear programming problem formulation…

Optimization and Control · Mathematics 2025-04-17 Kolos Csaba Ágoston, Sándor Bozóki, László Csató

This paper is a chapter in the forthcoming Handbook of Cluster Analysis, Hennig et al. (2015). For definitions of basic clustering methods and some further methodology, other chapters of the Handbook are referred to. To read this version of…

Methodology · Statistics 2015-03-09 Christian Hennig