Related papers: Classifying Clustering Schemes

Persistent Clustering and a Theorem of J. Kleinberg

We construct a framework for studying clustering algorithms, which includes two key ideas: persistence and functoriality. The first encodes the idea that the output of a clustering scheme should carry a multiresolution structure, the second…

Machine Learning · Statistics 2008-08-19 Gunnar Carlsson , Facundo Memoli

Issues,Challenges and Tools of Clustering Algorithms

Clustering is an unsupervised technique of Data Mining. It means grouping similar objects together and separating the dissimilar ones. Each object in the data set is assigned a class label in the clustering process using a distance measure.…

Information Retrieval · Computer Science 2011-10-13 Parul Agarwal , M. Afshar Alam , Ranjit Biswas

A Computational Theory and Semi-Supervised Algorithm for Clustering

A computational theory for clustering and a semi-supervised clustering algorithm is presented. Clustering is defined to be the obtainment of groupings of data such that each group contains no anomalies with respect to a chosen grouping…

Machine Learning · Computer Science 2025-07-17 Nassir Mohammad

To Cluster, or Not to Cluster: An Analysis of Clusterability Methods

Clustering is an essential data mining tool that aims to discover inherent cluster structure in data. For most applications, applying clustering is only appropriate when cluster structure is present. As such, the study of clusterability,…

Machine Learning · Statistics 2018-10-30 A. Adolfsson , M. Ackerman , N. C. Brownstein

Clustering validity based on the most similarity

One basic requirement of many studies is the necessity of classifying data. Clustering is a proposed method for summarizing networks. Clustering methods can be divided into two categories named model-based approaches and algorithmic…

Machine Learning · Computer Science 2013-02-19 Raheleh Namayandeh , Farzad Didehvar , Zahra Shojaei

Model-Based Clustering of Functional Data Via Random Projection Ensembles

Clustering functional data is a challenging task due to intrinsic infinite-dimensionality and the need for stable, data-adaptive partitioning. In this work, we propose a clustering framework based on Random Projections, which simultaneously…

Methodology · Statistics 2025-12-18 Matteo Mori , Laura Anderlucci

Clustering Sets of Functional Data by Similarity in Law

We introduce a new clustering method for the classification of functional data sets by their probabilistic law, that is, a procedure that aims to assign data sets to the same cluster if and only if the data were generated with the same…

Methodology · Statistics 2023-12-29 Antonio Galves , Fernando Najman , Marcela Svarc , Claudia D. Vargas

Algorithms and Complexity of Range Clustering

We introduce a novel criterion in clustering that seeks clusters with limited range of values associated with each cluster's elements. In clustering or classification the objective is to partition a set of objects into subsets, called…

Data Structures and Algorithms · Computer Science 2018-05-15 Dorit S. Hochbaum

Quartile Clustering: A quartile based technique for Generating Meaningful Clusters

Clustering is one of the main tasks in exploratory data analysis and descriptive statistics where the main objective is partitioning observations in groups. Clustering has a broad range of application in varied domains like climate,…

Databases · Computer Science 2012-03-20 Saptarsi Goswami , Amlan Chakrabarti

The cluster structure function

For each partition of a data set into a given number of parts there is a partition such that every part is as much as possible a good model (an "algorithmic sufficient statistic") for the data in that part. Since this can be done for every…

Machine Learning · Computer Science 2022-10-17 Andrew R. Cohen , Paul M. B. Vitányi

Learning with Clustering Structure

We study supervised learning problems using clustering constraints to impose structure on either features or samples, seeking to help both prediction and interpretation. The problem of clustering features arises naturally in text…

Machine Learning · Computer Science 2016-09-20 Vincent Roulet , Fajwel Fogel , Alexandre d'Aspremont , Francis Bach

A Rapid Review of Clustering Algorithms

Clustering algorithms aim to organize data into groups or clusters based on the inherent patterns and similarities within the data. They play an important role in today's life, such as in marketing and e-commerce, healthcare, data…

Machine Learning · Computer Science 2024-01-17 Hui Yin , Amir Aryani , Stephen Petrie , Aishwarya Nambissan , Aland Astudillo , Shengyuan Cao

Fair Labeled Clustering

Numerous algorithms have been produced for the fundamental problem of clustering under many different notions of fairness. Perhaps the most common family of notions currently studied is group fairness, in which proportional group…

Machine Learning · Computer Science 2023-06-06 Seyed A. Esmaeili , Sharmila Duppala , John P. Dickerson , Brian Brubach

When is Clustering Perturbation Robust?

Clustering is a fundamental data mining tool that aims to divide data into groups of similar items. Generally, intuition about clustering reflects the ideal case -- exact data sets endowed with flawless dissimilarity between individual…

Machine Learning · Computer Science 2016-01-25 Margareta Ackerman , Jarrod Moore

Review of Clustering Methods for Functional Data

Functional data clustering is to identify heterogeneous morphological patterns in the continuous functions underlying the discrete measurements/observations. Application of functional data clustering has appeared in many publications across…

Methodology · Statistics 2022-10-04 Mimi Zhang , Andrew Parnell

Cluster Explanation via Polyhedral Descriptions

Clustering is an unsupervised learning problem that aims to partition unlabelled data points into groups with similar features. Traditional clustering algorithms provide limited insight into the groups they find as their main focus is…

Machine Learning · Computer Science 2022-10-18 Connor Lawless , Oktay Gunluk

Consistency constraints for overlapping data clustering

We examine overlapping clustering schemes with functorial constraints, in the spirit of Carlsson--Memoli. This avoids issues arising from the chaining required by partition-based methods. Our principal result shows that any clustering…

Machine Learning · Computer Science 2016-08-16 Jared Culbertson , Dan P. Guralnik , Jakob Hansen , Peter F. Stiller

Selecting the number of clusters, clustering models, and algorithms. A unifying approach based on the quadratic discriminant score

Cluster analysis requires many decisions: the clustering method and the implied reference model, the number of clusters and, often, several hyper-parameters and algorithms' tunings. In practice, one produces several partitions, and a final…

Machine Learning · Statistics 2023-08-14 Luca Coraggio , Pietro Coretto

Incremental Clustering: The Case for Extra Clusters

The explosion in the amount of data available for analysis often necessitates a transition from batch to incremental clustering methods, which process one element at a time and typically store only a small subset of the data. In this paper,…

Machine Learning · Computer Science 2014-06-26 Margareta Ackerman , Sanjoy Dasgupta

How many clusters? An information theoretic perspective

Clustering provides a common means of identifying structure in complex data, and there is renewed interest in clustering as a tool for the analysis of large data sets in many fields. A natural question is how many clusters are appropriate…

Data Analysis, Statistics and Probability · Physics 2007-05-23 Susanne Still , William Bialek