Related papers: A Bayesian non-parametric method for clustering hi…

Sparse Bayesian Hierarchical Modeling of High-dimensional Clustering Problems

Clustering is one of the most widely used procedures in the analysis of microarray data, for example with the goal of discovering cancer subtypes based on observed heterogeneity of genetic marks between different tissues. It is well-known…

Methodology · Statistics 2009-04-21 Heng Lian

A Nonparametric Bayesian Method for Clustering of High-Dimensional Mixed Dataset

The paper is motivated from clustering problem in high-throughput mixed datasets. Clustering of such datasets can provide much insight into biological associations. An open problem in this context is to simultaneously cluster…

Methodology · Statistics 2018-08-15 Chetkar Jha

Bayesian Nonparametric Multilevel Clustering with Group-Level Contexts

We present a Bayesian nonparametric framework for multilevel clustering which utilizes group-level context information to simultaneously discover low-dimensional structures of the group contents and partitions groups into clusters. Using…

Machine Learning · Computer Science 2014-01-30 Vu Nguyen, Dinh Phung, XuanLong Nguyen, Svetha Venkatesh, Hung Hai Bui

Flexible Bayesian Nonparametric Product Mixtures for Multi-scale Functional Clustering

There is a rich literature on clustering functional data with applications to time-series modeling, trajectory data, and even spatio-temporal applications. However, existing methods routinely perform global clustering that enforces…

Methodology · Statistics 2024-12-16 Tsung-Hung Yao, Suprateek Kundu

Determinantal Clustering Processes - A Nonparametric Bayesian Approach to Kernel Based Semi-Supervised Clustering

Semi-supervised clustering is the task of clustering data points into clusters where only a fraction of the points are labelled. The true number of clusters in the data is often unknown and most models require this parameter as an input.…

Machine Learning · Computer Science 2013-09-27 Amar Shah, Zoubin Ghahramani

A Sparse Factor Model for Clustering High-Dimensional Longitudinal Data

Recent advances in engineering technologies have enabled the collection of a large number of longitudinal features. This wealth of information presents unique opportunities for researchers to investigate the complex nature of diseases and…

Methodology · Statistics 2023-11-27 Zihang Lu, Noirrit Kiran Chandra

Informed Asymmetric Dirichlet Priors for Multivariate Bernoulli Mixture Models

Clustering multivariate binary data is of interest in many scientific fields, including ecology, biomedicine, and social policy. Beyond heuristic clustering algorithms, such data can be modelled using multivariate Bernoulli mixture models.…

Methodology · Statistics 2026-04-24 Luisa Ferrari, Maria Franco Villoria, Garritt L. Page, Alex Laini

Revisiting k-means: New Algorithms via Bayesian Nonparametrics

Bayesian models offer great flexibility for clustering applications: Bayesian nonparametrics can be used for modeling infinite mixtures, and hierarchical Bayesian models can be utilized for sharing clusters across multiple data sets. For…

Machine Learning · Computer Science 2012-06-15 Brian Kulis, Michael I. Jordan

Global-Local Dirichlet Processes for Identifying Pan-Cancer Subpopulations Using Both Shared and Cancer-Specific Data

We consider the problem of clustering grouped data for which the observations may include group-specific variables in addition to the variables that are shared across groups. This type of data is common in cancer genomics where the…

Methodology · Statistics 2025-09-30 Arhit Chakrabarti, Yang Ni, Debdeep Pati, Bani K. Mallick

Flexible clustering via hidden hierarchical Dirichlet priors

The Bayesian approach to inference stands out for naturally allowing borrowing information across heterogeneous populations, with different samples possibly sharing the same distribution. A popular Bayesian nonparametric model for…

Methodology · Statistics 2022-01-25 Antonio Lijoi, Igor Prünster, Giovanni Rebaudo

Model-Based Hierarchical Clustering

We present an approach to model-based hierarchical clustering by formulating an objective function based on a Bayesian analysis. This model organizes the data into a cluster hierarchy while specifying a complex feature-set partitioning that…

Machine Learning · Computer Science 2013-01-18 Shivakumar Vaithyanathan, Byron E Dom

Unsupervised Joint Alignment and Clustering using Bayesian Nonparametrics

Joint alignment of a collection of functions is the process of independently transforming the functions so that they appear more similar to each other. Typically, such unsupervised alignment algorithms fail when presented with complex data…

Machine Learning · Computer Science 2012-10-19 Marwan A. Mattar, Allen R. Hanson, Erik G. Learned-Miller

Bayesian Nonparametric Graph Clustering

We present clustering methods for multivariate data exploiting the underlying geometry of the graphical structure between variables. As opposed to standard approaches that assume known graph structures, we first estimate the edge structure…

Methodology · Statistics 2015-09-28 Sayantan Banerjee, Rehan Akbani, Veerabhadran Baladandayuthapani

Probabilistic Clustering of Time-Evolving Distance Data

We present a novel probabilistic clustering model for objects that are represented via pairwise distances and observed at different time points. The proposed method utilizes the information given by adjacent time points to find the…

Machine Learning · Computer Science 2015-04-16 Julia E. Vogt, Marius Kloft, Stefan Stark, Sudhir S. Raman, Sandhya Prabhakaran, Volker Roth, Gunnar Rätsch

A Fast Algorithm for Clustering High Dimensional Feature Vectors

We propose an algorithm for clustering high dimensional data. If $P$ features for $N$ objects are represented in an $N\times P$ matrix ${\bf X}$, where $N\ll P$, the method is based on exploiting the cluster-dependent structure of the…

Machine Learning · Statistics 2018-11-05 Shahina Rahman, Valen E. Johnson

Bayesian Mixture Models for Frequent Itemset Discovery

In binary-transaction data-mining, traditional frequent itemset mining often produces results which are not straightforward to interpret. To overcome this problem, probability models are often used to produce more compact and conclusive…

Machine Learning · Computer Science 2012-09-27 Ruefei He, Jonathan Shapiro

Conjoined Dirichlet Process

Biclustering is a class of techniques that simultaneously clusters the rows and columns of a matrix to sort heterogeneous data into homogeneous blocks. Although many algorithms have been proposed to find biclusters, existing methods suffer…

Machine Learning · Statistics 2020-02-11 Michelle N. Ngo, Dustin S. Pluta, Alexander N. Ngo, Babak Shahbaba

Distributed Bayesian clustering using finite mixture of mixtures

In many modern applications, there is interest in analyzing enormous data sets that cannot be easily moved across computers or loaded into memory on a single computer. In such settings, it is very common to be interested in clustering.…

Computation · Statistics 2020-05-15 Hanyu Song, Yingjian Wang, David B. Dunson

Dirichlet Process Parsimonious Mixtures for clustering

The parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have shown their success in particular in cluster analysis. Their estimation is in general…

Machine Learning · Statistics 2018-10-18 Faicel Chamroukhi, Marius Bartcus, Hervé Glotin

Bayesian mixture models (in)consistency for the number of clusters

Bayesian nonparametric mixture models are common for modeling complex data. While these models are well-suited for density estimation, recent results proved posterior inconsistency of the number of clusters when the true number of…

Statistics Theory · Mathematics 2024-05-31 Louise Alamichel, Daria Bystrova, Julyan Arbel, Guillaume Kon Kam King