Related papers: Predictive clustering

Predictive K-means with local models

Supervised classification can be effective for prediction but sometimes weak on interpretability or explainability (XAI). Clustering, on the other hand, tends to isolate categories or profiles that can be meaningful but there is no…

Machine Learning · Computer Science 2021-04-27 Vincent Lemaire, Oumaima Alaoui Ismaili, Antoine Cornuéjols, Dominique Gay

A clustering approach for pairwise comparison matrices

We consider clustering in group decision making where the opinions are given by pairwise comparison matrices. In particular, the k-medoids model is suggested to classify the matrices since it has a linear programming problem formulation…

Optimization and Control · Mathematics 2025-04-17 Kolos Csaba Ágoston, Sándor Bozóki, László Csató

The Utility of Clustering in Prediction Tasks

We explore the utility of clustering in reducing error in various prediction tasks. Previous work has hinted at the improvement in prediction accuracy attributed to clustering algorithms if used to pre-process the data. In this work we more…

Machine Learning · Computer Science 2015-09-22 Shubhendu Trivedi, Zachary A. Pardos, Neil T. Heffernan

Estimating the number of clusters using cross-validation

Many clustering methods, including k-means, require the user to specify the number of clusters as an input parameter. A variety of methods have been devised to choose the number of clusters automatically, but they often rely on strong…

Methodology · Statistics 2017-02-10 Wei Fu, Patrick O. Perry

Learning with Clustering Structure

We study supervised learning problems using clustering constraints to impose structure on either features or samples, seeking to help both prediction and interpretation. The problem of clustering features arises naturally in text…

Machine Learning · Computer Science 2016-09-20 Vincent Roulet, Fajwel Fogel, Alexandre d'Aspremont, Francis Bach

The K-modes algorithm for clustering

Many clustering algorithms exist that estimate a cluster centroid, such as K-means, K-medoids or mean-shift, but no algorithm seems to exist that clusters data by returning exactly K meaningful modes. We propose a natural definition of a…

Machine Learning · Computer Science 2013-04-25 Miguel Á. Carreira-Perpiñán, Weiran Wang

Clustering -- Basic concepts and methods

We review clustering as an analysis tool and the underlying concepts from an introductory perspective. What is clustering and how can clusterings be realised programmatically? How can data be represented and prepared for a clustering task?…

Machine Learning · Computer Science 2022-12-05 Jan-Oliver Felix Kapp-Joswig, Bettina G. Keller

Identifying the number of clusters for K-Means: A hypersphere density based approach

Application of K-Means algorithm is restricted by the fact that the number of clusters should be known beforehand. Previously suggested methods to solve this problem are either ad hoc or require parametric assumptions and complicated…

Machine Learning · Computer Science 2019-12-05 Sukavanan Nanjundan, Shreeviknesh Sankaran, C. R. Arjun, G. Paavai Anand

Time series clustering based on prediction accuracy of global forecasting models

In this paper, a novel method to perform model-based clustering of time series is proposed. The procedure relies on two iterative steps: (i) K global forecasting models are fitted via pooling by considering the series pertaining to each…

Machine Learning · Statistics 2023-05-02 Ángel López Oriona, Pablo Montero Manso, José Antonio Vilar Fernández

Forecasting Method for Grouped Time Series with the Use of k-Means Algorithm

The paper is focused on the forecasting method for time series groups with the use of algorithms for cluster analysis. $K$-means algorithm is suggested to be a basic one for clustering. The coordinates of the centers of clusters have been…

Machine Learning · Computer Science 2015-09-17 N. N. Astakhova, L. A. Demidova, E. V. Nikulchev

Analysis of Sparse Subspace Clustering: Experiments and Random Projection

Clustering can be defined as the process of assembling objects into a number of groups whose elements are similar to each other in some manner. As a technique that is used in many domains, such as face clustering, plant categorization,…

Machine Learning · Computer Science 2022-04-05 Mehmet F. Demirel, Enrico Au-Yeung

Learning-Augmented $k$-means Clustering

$k$-means clustering is a well-studied problem due to its wide applicability. Unfortunately, there exist strong theoretical limits on the performance of any algorithm for the $k$-means problem on worst-case inputs. To overcome this barrier,…

Machine Learning · Computer Science 2022-03-22 Jon C. Ergun, Zhili Feng, Sandeep Silwal, David P. Woodruff, Samson Zhou

Probabilistic Partitive Partitioning (PPP)

Clustering is a NP-hard problem. Thus, no optimal algorithm exists, heuristics are applied to cluster the data. Heuristics can be very resource-intensive, if not applied properly. For substantially large data sets computational efficiencies…

Databases · Computer Science 2020-03-11 Mujahid Sultan

Semi-supervised clustering methods

Cluster analysis methods seek to partition a data set into homogeneous subgroups. It is useful in a wide variety of applications, including document processing and modern genetics. Conventional clustering methods are unsupervised, meaning…

Methodology · Statistics 2014-07-11 Eric Bair

Radius-Guided Post-Clustering for Shape-Aware, Scalable Refinement of k-Means Results

Traditional k-means clustering underperforms on non-convex shapes and requires the number of clusters k to be specified in advance. We propose a simple geometric enhancement: after standard k-means, each cluster center is assigned a radius…

Machine Learning · Computer Science 2025-04-30 Stefan Kober

Unsupervised classification of uncertain data objects in spatial databases using computational geometry and indexing techniques

Unsupervised classification called clustering is a process of organizing objects into groups whose members are similar in some way. Clustering of uncertain data objects is a challenge in spatial databases. In this paper we use Probability…

Databases · Computer Science 2013-12-10 Ramachandra Rao Kurada

Discriminative k-means clustering

The k-means algorithm is a partitional clustering method. Over 60 years old, it has been successfully used for a variety of problems. The popularity of k-means is in large part a consequence of its simplicity and efficiency. In this paper…

Computer Vision and Pattern Recognition · Computer Science 2013-06-11 Ognjen Arandjelovic

Merging $K$-means with hierarchical clustering for identifying general-shaped groups

Clustering partitions a dataset such that observations placed together in a group are similar but different from those in other groups. Hierarchical and $K$-means clustering are two approaches but have different strengths and weaknesses.…

Machine Learning · Statistics 2017-12-27 Anna D. Peterson, Arka P. Ghosh, Ranjan Maitra

A penalized criterion for selecting the number of clusters for K-medians

Clustering is a usual unsupervised machine learning technique for grouping the data points into groups based upon similar features. We focus here on unsupervised clustering for contaminated data, i.e., in the case where K-medians should be…

Statistics Theory · Mathematics 2024-02-28 Antoine Godichon-Baggioni, Sobihan Surendran

Spherical clustering in detection of groups of concomitant extremes

There is growing empirical evidence that spherical $k$-means clustering performs well at identifying groups of concomitant extremes in high dimensions, thereby leading to sparse models. We provide one of the first theoretical results…

Statistics Theory · Mathematics 2022-03-21 V. Fomichov, J. Ivanovs