Related papers: Document Clustering using Sequential Information B…

200 papers

The information bottleneck (IB) approach to clustering takes a joint distribution $P\!\left(X,Y\right)$ and maps the data $X$ to cluster labels $T$ which retain maximal information about $Y$ (Tishby et al., 1999). This objective results in…

Machine Learning · Statistics 2020-06-02 DJ Strouse, David J Schwab

We explore the geometrical interpretation of the PCA-based clustering algorithm Principal Direction Divisive Partitioning (PDDP). We give several examples where this algorithm breaks down, and suggest a new method, gap partitioning, which…

Machine Learning · Statistics 2012-11-20 Ralph Abbey, Jeremy Diepenbrock, Amy Langville, Carl Meyer, Shaina Race, Dexin Zhou

In this paper, we present an information-theoretic method for clustering mixed-type data, that is, data consisting of both continuous and categorical variables. The proposed approach extends the Information Bottleneck principle to…

Methodology · Statistics 2026-02-02 Efthymios Costa, Ioanna Papatsouma, Angelos Markos

Large Language Models (LLMs) are prone to critical failure modes, including intrinsic faithfulness hallucinations (also known as confabulations), where a response deviates semantically from the provided context. Frameworks designed…

Computation and Language · Computer Science 2025-09-05 Igor Halperin

In recent years, the information bottleneck (IB) principle has provided an information-theoretic framework for deep multi-view clustering (MVC) by compressing multi-view observations while preserving the relevant information of multiple…

Information Theory · Computer Science 2024-03-26 Xiaoqiang Yan, Zhixiang Jin, Fengshou Han, Yangdong Ye

In this paper, we propose a semi-supervised clustering method, CEC-IB, that models data with a set of Gaussian distributions and that retrieves clusters based on a partial labeling provided by the user (partition-level side information). By…

Machine Learning · Computer Science 2017-11-15 Marek Śmieja, Bernhard C. Geiger

Lossy compression and clustering fundamentally involve a decision about which features are relevant and which are not. The information bottleneck method (IB) by Tishby, Pereira, and Bialek formalized this notion as an information-theoretic…

Neurons and Cognition · Quantitative Biology 2017-02-23 DJ Strouse, David J Schwab

Clustering is a widely deployed unsupervised learning tool. Model-based clustering is a flexible framework to tackle data heterogeneity when the clusters have different shapes. Likelihood-based inference for mixture distributions often…

Machine Learning · Statistics 2023-05-30 Yubo Zhuang, Xiaohui Chen, Yun Yang

Collaborative edge sensing systems, particularly in collaborative perception systems in autonomous driving, can significantly enhance tracking accuracy and reduce blind spots with multi-view sensing capabilities. However, their limited…

Networking and Internet Architecture · Computer Science 2024-09-02 Zhengru Fang, Senkang Hu, Liyan Yang, Yiqin Deng, Xianhao Chen, Yuguang Fang

Cluster analysis relates to the task of assigning objects into groups which ideally present some desirable characteristics. When a cluster structure is confined to a subset of the feature space, traditional clustering techniques face…

Machine Learning · Statistics 2026-04-14 Efthymios Costa, Ioanna Papatsouma, Angelos Markos

Learning with hidden variables is a central challenge in probabilistic graphical models that has important implications for many real-life problems. The classical approach is using the Expectation Maximization (EM) algorithm. This…

Machine Learning · Computer Science 2012-12-12 Gal Elidan, Nir Friedman

The Information Bottleneck (IB) framework is a general characterization of optimal representations obtained using a principled approach for balancing accuracy and complexity. Here we present a new framework, the Dual Information Bottleneck…

Information Theory · Computer Science 2020-06-09 Zoe Piran, Ravid Shwartz-Ziv, Naftali Tishby

Collaborative perception systems leverage multiple edge devices, such as surveillance cameras or autonomous cars, to enhance sensing quality and eliminate blind spots. Despite their advantages, challenges such as limited channel capacity and…

Networking and Internet Architecture · Computer Science 2025-01-07 Zhengru Fang, Senkang Hu, Jingjing Wang, Yiqin Deng, Xianhao Chen, Yuguang Fang

Clustering is an NP-hard problem. Because no efficient optimal algorithm is known, heuristics are applied to cluster the data. Heuristics can be very resource-intensive if not applied properly. For substantially large data sets, computational efficiencies…

Databases · Computer Science 2020-03-11 Mujahid Sultan

The Information Bottleneck (IB) principle has emerged as a promising approach for enhancing the generalization, robustness, and interpretability of deep neural networks, demonstrating efficacy across image segmentation, document clustering,…

Information Theory · Computer Science 2025-04-18 Hanzhe Yang, Youlong Wu, Dingzhu Wen, Yong Zhou, Yuanming Shi

The Information Bottleneck (IB) method is an information-theoretic framework to design a parsimonious and tunable feature-extraction mechanism, such that the extracted features are maximally relevant to a specific learning or inference…

Signal Processing · Electrical Eng. & Systems 2024-04-17 Francesco Binucci, Paolo Banelli, Paolo Di Lorenzo, Sergio Barbarossa

In this paper, we examine a formalization of feature distribution learning (FDL) in information-theoretic terms, relying on the analytical approach and on the tools already used in the study of the information bottleneck (IB). It has been…

Machine Learning · Computer Science 2019-10-22 Fabio Massimo Zennaro, Ke Chen

The Symmetric Information Bottleneck (SIB), an extension of the more familiar Information Bottleneck, is a dimensionality reduction technique that simultaneously compresses two random variables to preserve information between their…

Information Theory · Computer Science 2024-02-06 K. Michael Martini, Ilya Nemenman

Stochastic gradient descent (SGD) is a powerful method for large-scale optimization problems in the area of machine learning, especially for a finite-sum formulation with numerous variables. In recent years, mini-batch SGD gains great…

Optimization and Control · Mathematics 2020-01-24 Kun He, Min Zhang, Jianrong Zhou, Yan Jin, Chu-min Li

Information Bottleneck (IB) is a technique to extract information about one target random variable through another relevant random variable. This technique has garnered significant interest due to its broad applications in information…

Information Theory · Computer Science 2024-04-09 Lingyi Chen, Shitong Wu, Jiachuan Ye, Huihui Wu, Wenyi Zhang, Hao Wu