Related papers: A Deterministic Information Bottleneck Method for …

Multivariate Information Bottleneck

The Information bottleneck method is an unsupervised non-parametric data organization technique. Given a joint distribution P(A,B), this method constructs a new variable T that extracts partitions, or clusters, over the values of A that are…

Machine Learning · Computer Science 2013-01-14 Nir Friedman, Ori Mosenzon, Noam Slonim, Naftali Tishby

Differentiable Information Bottleneck for Deterministic Multi-view Clustering

In recent years, the information bottleneck (IB) principle has provided an information-theoretic framework for deep multi-view clustering (MVC) by compressing multi-view observations while preserving the relevant information of multiple…

Information Theory · Computer Science 2024-03-26 Xiaoqiang Yan, Zhixiang Jin, Fengshou Han, Yangdong Ye

Sparse clustering via the Deterministic Information Bottleneck algorithm

Cluster analysis relates to the task of assigning objects into groups which ideally present some desirable characteristics. When a cluster structure is confined to a subset of the feature space, traditional clustering techniques face…

Machine Learning · Statistics 2026-04-14 Efthymios Costa, Ioanna Papatsouma, Angelos Markos

The information bottleneck and geometric clustering

The information bottleneck (IB) approach to clustering takes a joint distribution $P\!\left(X,Y\right)$ and maps the data $X$ to cluster labels $T$ which retain maximal information about $Y$ (Tishby et al., 1999). This objective results in…

Machine Learning · Statistics 2020-06-02 DJ Strouse, David J Schwab

Hierarchical clustering of mixed-type data based on barycentric coding

Clustering of mixed-type datasets can be a particularly challenging task as it requires taking into account the associations between variables with different levels of measurement, i.e., nominal, ordinal and/or interval. In some cases,…

Methodology · Statistics 2022-04-22 Odysseas Moschidis, Angelos Markos, Theodore Chadjipadelis

Mixed data Deep Gaussian Mixture Model: A clustering model for mixed datasets

Clustering mixed data presents numerous challenges inherent to the very heterogeneous nature of the variables. A clustering algorithm should be able, despite this heterogeneity, to extract discriminant pieces of information from the…

Machine Learning · Computer Science 2022-05-10 Robin Fuchs, Denys Pommeret, Cinzia Viroli

Clustering Approaches for Mixed-Type Data: A Comparative Study

Clustering is widely used in unsupervised learning to find homogeneous groups of observations within a dataset. However, clustering mixed-type data remains a challenge, as few existing approaches are suited for this task. This study…

Machine Learning · Statistics 2025-11-26 Badih Ghattas, Alvaro Sanchez San-Benito

Dynamic Multimodal Information Bottleneck for Multimodality Classification

Effectively leveraging multimodal data such as various images, laboratory tests and clinical information is gaining traction in a variety of AI-based medical diagnosis and prognosis tasks. Most existing multi-modal techniques only focus on…

Image and Video Processing · Electrical Eng. & Systems 2023-11-28 Yingying Fang, Shuang Wu, Sheng Zhang, Chaoyan Huang, Tieyong Zeng, Xiaodan Xing, Simon Walsh, Guang Yang

Variable selection for mixed data clustering: a model-based approach

We propose two approaches for selecting variables in latent class analysis (i.e., a mixture model assuming within-component independence), which is the common model-based clustering method for mixed data. The first approach consists in…

Computation · Statistics 2017-03-08 Matthieu Marbac, Mohammed Sedki

Document Clustering using Sequential Information Bottleneck Method

This paper illustrates the Principal Direction Divisive Partitioning (PDDP) algorithm and describes its drawbacks and introduces a combinatorial framework of the Principal Direction Divisive Partitioning (PDDP) algorithm, then describes the…

Information Retrieval · Computer Science 2010-04-13 P. J. Gayathri, S. C. Punitha, M. Punithavalli

Identifying the number of clusters in discrete mixture models

Research on cluster analysis for categorical data continues to develop, with new clustering algorithms being proposed. However, in this context, the determination of the number of clusters is rarely addressed. In this paper, we propose a…

Methodology · Statistics 2014-09-29 Cláudia Silvestre, Margarida G. M. S. Cardoso, Mário A. T. Figueiredo

Simultaneous Clustering and Model Selection for Multinomial Distribution: A Comparative Study

In this paper, we study different discrete data clustering methods, which use the Model-Based Clustering (MBC) framework with the Multinomial distribution. Our study comprises several relevant issues, such as initialization, model…

Machine Learning · Computer Science 2015-09-08 Md. Abul Hasnat, Julien Velcin, Stéphane Bonnevay, Julien Jacques

Demystifying Information-Theoretic Clustering

We propose a novel method for clustering data which is grounded in information-theoretic principles and requires no parametric assumptions. Previous attempts to use information theory to define clusters in an assumption-free way are based…

Machine Learning · Computer Science 2014-02-07 Greg Ver Steeg, Aram Galstyan, Fei Sha, Simon DeDeo

Mixed Data Clustering Survey and Challenges

The advent of the big data paradigm has transformed how industries manage and analyze information, ushering in an era of unprecedented data volume, velocity, and variety. Within this landscape, mixed-data clustering has become a critical…

Machine Learning · Computer Science 2025-12-04 Guillaume Guerard, Sonia Djebali

Novel Feature-Based Clustering of Micro-Panel Data (CluMP)

Micro-panel data are collected and analysed in many research and industry areas. Cluster analysis of micro-panel data is an unsupervised learning exploratory method identifying subgroup clusters in a data set which include homogeneous…

Machine Learning · Statistics 2018-07-17 Lukas Sobisek, Maria Stachova, Jan Fojtik

Model Based Clustering for Mixed Data: clustMD

A model based clustering procedure for data of mixed type, clustMD, is developed using a latent variable model. It is proposed that a latent variable, following a mixture of Gaussian distributions, generates the observed data of mixed type.…

Methodology · Statistics 2015-11-06 Damien McParland, Isobel Claire Gormley

The Dual Information Bottleneck

The Information Bottleneck (IB) framework is a general characterization of optimal representations obtained using a principled approach for balancing accuracy and complexity. Here we present a new framework, the Dual Information Bottleneck…

Information Theory · Computer Science 2020-06-09 Zoe Piran, Ravid Shwartz-Ziv, Naftali Tishby

AugDMC: Data Augmentation Guided Deep Multiple Clustering

Clustering aims to group similar objects together while separating dissimilar ones apart. Thereafter, structures hidden in data can be identified to help understand data in an unsupervised manner. Traditional clustering methods such as…

Computer Vision and Pattern Recognition · Computer Science 2023-06-23 Jiawei Yao, Enbei Liu, Maham Rashid, Juhua Hu

Informed Asymmetric Dirichlet Priors for Multivariate Bernoulli Mixture Models

Clustering multivariate binary data is of interest in many scientific fields, including ecology, biomedicine, and social policy. Beyond heuristic clustering algorithms, such data can be modelled using multivariate Bernoulli mixture models.…

Methodology · Statistics 2026-04-24 Luisa Ferrari, Maria Franco Villoria, Garritt L. Page, Alex Laini

Too Much Information Kills Information: A Clustering Perspective

Clustering is one of the most fundamental tools in artificial intelligence, particularly in pattern recognition and learning theory. In this paper, we propose a simple, but novel approach for variance-based k-clustering tasks,…

Machine Learning · Computer Science 2020-09-17 Yicheng Xu, Vincent Chau, Chenchen Wu, Yong Zhang, Vassilis Zissimopoulos, Yifei Zou