Related papers: Explainable cluster analysis: a bagging approach

Element-centric clustering comparison unifies overlaps and hierarchy

Clustering is one of the most universal approaches for understanding complex data. A pivotal aspect of clustering analysis is quantitatively comparing clusterings; clustering comparison is the basis for many tasks such as clustering…

Machine Learning · Statistics 2019-06-13 Alexander J. Gates , Ian B. Wood , William P. Hetrick , Yong-Yeol Ahn

Feature Screening in Large Scale Cluster Analysis

We propose a novel methodology for feature screening in clustering massive datasets, in which both the number of features and the number of observations can potentially be very large. Taking advantage of a fusion penalization based convex…

Methodology · Statistics 2017-10-05 Trambak Banerjee , Gourab Mukherjee , Peter Radchenko

Towards Explainable Clustering: A Constrained Declarative based Approach

The domain of explainable AI is of interest in all Machine Learning fields, and it is all the more important in clustering, an unsupervised task whose result must be validated by a domain expert. We aim at finding a clustering that has high…

Artificial Intelligence · Computer Science 2024-03-28 Mathieu Guilbert , Christel Vrain , Thi-Bich-Hanh Dao

Quartile Clustering: A quartile based technique for Generating Meaningful Clusters

Clustering is one of the main tasks in exploratory data analysis and descriptive statistics where the main objective is partitioning observations in groups. Clustering has a broad range of application in varied domains like climate,…

Databases · Computer Science 2012-03-20 Saptarsi Goswami , Amlan Chakrabarti

Interpretable Clustering with the Distinguishability Criterion

Cluster analysis is a popular unsupervised learning tool used in many disciplines to identify heterogeneous sub-populations within a sample. However, validating cluster analysis results and determining the number of clusters in a data set…

Machine Learning · Statistics 2024-04-26 Ali Turfah , Xiaoquan Wen

Interpretable Clustering Ensemble

Clustering ensemble has emerged as an important research topic in the field of machine learning. Although numerous methods have been proposed to improve clustering quality, most existing approaches overlook the need for interpretability in…

Machine Learning · Computer Science 2025-06-09 Hang Lv , Lianyu Hu , Mudi Jiang , Xinying Liu , Zengyou He

To Cluster, or Not to Cluster: An Analysis of Clusterability Methods

Clustering is an essential data mining tool that aims to discover inherent cluster structure in data. For most applications, applying clustering is only appropriate when cluster structure is present. As such, the study of clusterability,…

Machine Learning · Statistics 2018-10-30 A. Adolfsson , M. Ackerman , N. C. Brownstein

Deep Descriptive Clustering

Recent work on explainable clustering allows describing clusters when the features are interpretable. However, much modern machine learning focuses on complex data such as images, text, and graphs where deep learning is used but the raw…

Machine Learning · Computer Science 2021-05-26 Hongjing Zhang , Ian Davidson

Tree Index: A New Cluster Evaluation Technique

We introduce a cluster evaluation technique called Tree Index. Our Tree Index algorithm aims at describing the structural information of the clustering rather than the quantitative format of cluster-quality indexes (where the representation…

Machine Learning · Computer Science 2020-03-25 A. H. Beg , Md Zahidul Islam , Vladimir Estivill-Castro

Selecting the number of clusters, clustering models, and algorithms. A unifying approach based on the quadratic discriminant score

Cluster analysis requires many decisions: the clustering method and the implied reference model, the number of clusters and, often, several hyper-parameters and algorithms' tunings. In practice, one produces several partitions, and a final…

Machine Learning · Statistics 2023-08-14 Luca Coraggio , Pietro Coretto

Weighted Clustering Ensemble: A Review

Clustering ensemble, or consensus clustering, has emerged as a powerful tool for improving both the robustness and the stability of results from individual clustering methods. Weighted clustering ensemble arises naturally from clustering…

Computer Vision and Pattern Recognition · Computer Science 2021-12-14 Mimi Zhang

Simple and Scalable Sparse k-means Clustering via Feature Ranking

Clustering, a fundamental activity in unsupervised learning, is notoriously difficult when the feature space is high-dimensional. Fortunately, in many realistic scenarios, only a handful of features are relevant in distinguishing clusters.…

Machine Learning · Statistics 2020-10-23 Zhiyue Zhang , Kenneth Lange , Jason Xu

Explainable $k$-Means and $k$-Medians Clustering

Clustering is a popular form of unsupervised learning for geometric data. Unfortunately, many clustering algorithms lead to cluster assignments that are hard to explain, partially because they depend on all the features of the data in a…

Machine Learning · Computer Science 2020-09-23 Sanjoy Dasgupta , Nave Frost , Michal Moshkovitz , Cyrus Rashtchian

XClusters: Explainability-first Clustering

We study the problem of explainability-first clustering where explainability becomes a first-class citizen for clustering. Previous clustering approaches use decision trees for explanation, but only after the clustering is completed. In…

Machine Learning · Computer Science 2022-12-13 Hyunseung Hwang , Steven Euijong Whang

A shortest-path based clustering algorithm for joint human-machine analysis of complex datasets

Clustering is a technique for the analysis of datasets obtained by empirical studies in several disciplines with a major application for biomedical research. Essentially, clustering algorithms are executed by machines aiming at finding…

Quantitative Methods · Quantitative Biology 2024-09-30 Diego Ulisse Pizzagalli , Santiago Fernandez Gonzalez , Rolf Krause

Incremental Clustering: The Case for Extra Clusters

The explosion in the amount of data available for analysis often necessitates a transition from batch to incremental clustering methods, which process one element at a time and typically store only a small subset of the data. In this paper,…

Machine Learning · Computer Science 2014-06-26 Margareta Ackerman , Sanjoy Dasgupta

On Hyperparameter Search in Cluster Ensembles

Quality assessments of models in unsupervised learning and clustering verification in particular have been a long-standing problem in the machine learning research. The lack of robust and universally applicable cluster validity scores often…

Machine Learning · Statistics 2018-03-30 Luzie Helfmann , Johannes von Lindheim , Mattes Mollenhauer , Ralf Banisch

Clustering Mixed Numeric and Categorical Data: A Cluster Ensemble Approach

Clustering is a widely used technique in data mining applications for discovering patterns in underlying data. Most traditional clustering algorithms are limited to handling datasets that contain either numeric or categorical attributes.…

Artificial Intelligence · Computer Science 2007-05-23 Zengyou He , Xiaofei Xu , Shengchun Deng

Balancing the Tradeoff Between Clustering Value and Interpretability

Graph clustering groups entities -- the vertices of a graph -- based on their similarity, typically using a complex distance function over a large number of features. Successful integration of clustering approaches in automated…

Machine Learning · Statistics 2020-02-03 Sandhya Saisubramanian , Sainyam Galhotra , Shlomo Zilberstein

Clustering validity based on the most similarity

One basic requirement of many studies is the necessity of classifying data. Clustering is a proposed method for summarizing networks. Clustering methods can be divided into two categories named model-based approaches and algorithmic…

Machine Learning · Computer Science 2013-02-19 Raheleh Namayandeh , Farzad Didehvar , Zahra Shojaei