Related papers: Sparse clustering via the Deterministic Informatio…

Weighted Sparse Subspace Representation: A Unified Framework for Subspace Clustering, Constrained Clustering, and Active Learning

Spectral-based subspace clustering methods have proved successful in many challenging applications such as gene sequencing, image recognition, and motion segmentation. In this work, we first propose a novel spectral-based subspace…

Machine Learning · Statistics 2021-06-09 Hankui Peng, Nicos G. Pavlidis

Learning with Clustering Structure

We study supervised learning problems using clustering constraints to impose structure on either features or samples, seeking to help both prediction and interpretation. The problem of clustering features arises naturally in text…

Machine Learning · Computer Science 2016-09-20 Vincent Roulet, Fajwel Fogel, Alexandre d'Aspremont, Francis Bach

Clustering Mixed Numeric and Categorical Data: A Cluster Ensemble Approach

Clustering is a widely used technique in data mining applications for discovering patterns in underlying data. Most traditional clustering algorithms are limited to handling datasets that contain either numeric or categorical attributes.…

Artificial Intelligence · Computer Science 2007-05-23 Zengyou He, Xiaofei Xu, Shengchun Deng

Simple and Scalable Sparse k-means Clustering via Feature Ranking

Clustering, a fundamental activity in unsupervised learning, is notoriously difficult when the feature space is high-dimensional. Fortunately, in many realistic scenarios, only a handful of features are relevant in distinguishing clusters.…

Machine Learning · Statistics 2020-10-23 Zhiyue Zhang, Kenneth Lange, Jason Xu

Identification of relevant subtypes via preweighted sparse clustering

Cluster analysis methods are used to identify homogeneous subgroups in a data set. In biomedical applications, one frequently applies cluster analysis in order to identify biologically interesting subgroups. In particular, one may wish to…

Methodology · Statistics 2016-09-23 Sheila Gaynor, Eric Bair

A matching based clustering algorithm for categorical data

Cluster analysis is one of the essential tasks in data mining and knowledge discovery. Each type of data poses unique challenges in achieving relatively efficient partitioning of the data into homogeneous groups. While the algorithms for…

Machine Learning · Computer Science 2018-12-11 Ruben A. Gevorgyan, Yenok B. Hakobyan

A probabilistic constrained clustering for transfer learning and image category discovery

Neural network-based clustering has recently gained popularity, and in particular a constrained clustering formulation has been proposed to perform transfer learning and image category discovery using deep learning. The core idea is to…

Computer Vision and Pattern Recognition · Computer Science 2018-06-29 Yen-Chang Hsu, Zhaoyang Lv, Joel Schlosser, Phillip Odom, Zsolt Kira

A Deterministic Information Bottleneck Method for Clustering Mixed-Type Data

In this paper, we present an information-theoretic method for clustering mixed-type data, that is, data consisting of both continuous and categorical variables. The proposed approach extends the Information Bottleneck principle to…

Methodology · Statistics 2026-02-02 Efthymios Costa, Ioanna Papatsouma, Angelos Markos

Probabilistic Sparse Subspace Clustering Using Delayed Association

Discovering and clustering subspaces in high-dimensional data is a fundamental problem of machine learning with a wide range of applications in data mining, computer vision, and pattern recognition. Earlier methods divided the problem into…

Machine Learning · Statistics 2018-08-30 Maryam Jaberi, Marianna Pensky, Hassan Foroosh

Efficient Information Theoretic Clustering on Discrete Lattices

We consider the problem of clustering data that reside on discrete, low dimensional lattices. Canonical examples for this setting are found in image segmentation and key point extraction. Our solution is based on a recent approach to…

Computer Vision and Pattern Recognition · Computer Science 2013-10-29 Christian Bauckhage, Kristian Kersting

A Theoretical Analysis of Noisy Sparse Subspace Clustering on Dimensionality-Reduced Data

Subspace clustering is the problem of partitioning unlabeled data points into a number of clusters so that data points within one cluster lie approximately on a low-dimensional linear subspace. In many practical scenarios, the…

Machine Learning · Statistics 2019-01-24 Yining Wang, Yu-Xiang Wang, Aarti Singh

Analysis of Sparse Subspace Clustering: Experiments and Random Projection

Clustering can be defined as the process of assembling objects into a number of groups whose elements are similar to each other in some manner. As a technique that is used in many domains, such as face clustering, plant categorization,…

Machine Learning · Computer Science 2022-04-05 Mehmet F. Demirel, Enrico Au-Yeung

Information based clustering

In an age of increasingly large data sets, investigators in many different disciplines have turned to clustering as a tool for data analysis and exploration. Existing clustering methods, however, typically depend on several nontrivial…

Quantitative Methods · Quantitative Biology 2009-11-11 Noam Slonim, Gurinder Singh Atwal, Gasper Tkacik, William Bialek

Clustering on Multiple Incomplete Datasets via Collective Kernel Learning

Multiple datasets containing different types of features may be available for a given task. For instance, users' profiles can be used to group users for recommendation systems. In addition, a model can also use users' historical behaviors…

Machine Learning · Computer Science 2016-05-10 Weixiang Shao, Xiaoxiao Shi, Philip S. Yu

A shortest-path based clustering algorithm for joint human-machine analysis of complex datasets

Clustering is a technique for the analysis of datasets obtained by empirical studies in several disciplines with a major application for biomedical research. Essentially, clustering algorithms are executed by machines aiming at finding…

Quantitative Methods · Quantitative Biology 2024-09-30 Diego Ulisse Pizzagalli, Santiago Fernandez Gonzalez, Rolf Krause

Leachable Component Clustering

Clustering attempts to partition data instances into several distinctive groups, while the similarities among data belonging to the common partition can be principally reserved. Furthermore, incomplete data frequently occurs in many…

Machine Learning · Computer Science 2022-08-30 Miao Cheng, Xinge You

Distributed clustering in partially overlapping feature spaces

We introduce and address a novel distributed clustering problem where each participant has a private dataset containing only a subset of all available features, and some features are included in multiple datasets. This scenario occurs in…

Data Structures and Algorithms · Computer Science 2025-10-14 Alessio Maritan, Luca Schenato

Sparse Convex Clustering

Convex clustering, a convex relaxation of k-means clustering and hierarchical clustering, has drawn recent attention since it nicely addresses the instability issue of traditional nonconvex clustering methods. Although its computational…

Methodology · Statistics 2019-01-01 Binhuan Wang, Yilong Zhang, Will Wei Sun, Yixin Fang

SACA: Selective Attention-Based Clustering Algorithm

Clustering algorithms are fundamental tools across many fields, with density-based methods offering particular advantages in identifying arbitrarily shaped clusters and handling noise. However, their effectiveness is often limited by the…

Machine Learning · Computer Science 2025-12-01 Meysam Shirdel Bilehsavar, Razieh Ghaedi, Samira Seyed Taheri, Xinqi Fan, Christian O'Reilly

Sparse Clustering of Functional Data

We consider the problem of clustering functional data while jointly selecting the most relevant features for classification. This problem has never been tackled before in the functional data context, and it requires a proper definition of…

Methodology · Statistics 2015-01-21 Davide Floriello, Valeria Vitelli