Related papers: Conditional partial exchangeability: a probabilist…

Partially Observed Exchangeable Modeling

Modeling dependencies among features is fundamental for many machine learning tasks. Although there are often multiple related instances that may be leveraged to inform conditional dependencies, typical approaches only model conditional…

Machine Learning · Computer Science 2021-02-12 Yang Li, Junier B. Oliva

Exchangeability, prediction and predictive modeling in Bayesian statistics

There is currently a renewed interest in the Bayesian predictive approach to statistics. This paper offers a review on foundational concepts and focuses on predictive modeling, which by directly reasoning on prediction, bypasses inferential…

Statistics Theory · Mathematics 2024-11-22 Sandra Fortini, Sonia Petrone

Model-based clustering for conditionally correlated categorical data

An extension of the latent class model is presented for clustering categorical data by relaxing the classical "class conditional independence assumption" of variables. This model consists in grouping the variables into inter-independent and…

Computation · Statistics 2015-10-01 Matthieu Marbac, Christophe Biernacki, Vincent Vandewalle

Multivariate Species Sampling Models

Species sampling processes have long served as the fundamental framework for modeling random discrete distributions and exchangeable sequences. However, data arising from distinct but related sources require a broader notion of…

Statistics Theory · Mathematics 2026-02-03 Beatrice Franzolini, Antonio Lijoi, Igor Prünster, Giovanni Rebaudo

Tractability through Exchangeability: A New Perspective on Efficient Probabilistic Inference

Exchangeability is a central notion in statistics and probability theory. The assumption that an infinite sequence of data points is exchangeable is at the core of Bayesian statistics. However, finite exchangeability as a statistical…

Artificial Intelligence · Computer Science 2014-04-24 Mathias Niepert, Guy Van den Broeck

Learning Densities Conditional on Many Interacting Features

Learning a distribution conditional on a set of discrete-valued features is a commonly encountered task. This becomes more challenging with a high-dimensional feature set when there is the possibility of interaction between the features. In…

Machine Learning · Statistics 2013-05-01 David C. Kessler, Jack Taylor, David B. Dunson

The role of exchangeability in causal inference

Though the notion of exchangeability has been discussed in the causal inference literature under various guises, it has rarely taken its original meaning as a symmetry property of probability distributions. As this property is a standard…

Methodology · Statistics 2023-10-04 Olli Saarela, David A. Stephens, Erica E. M. Moodie

Model-Based Longitudinal Clustering with Varying Cluster Assignments

It is often of interest to perform clustering on longitudinal data, yet it is difficult to formulate an intuitive model for which estimation is computationally feasible. We propose a model-based clustering method for clustering objects that…

Methodology · Statistics 2020-05-19 Daniel K. Sewell, Yuguo Chen, William Bernhard, Tracy Sulkin

Enhancing the selection of a model-based clustering with external qualitative variables

In cluster analysis, it can be useful to interpret the partition built from the data in the light of external categorical variables which were not directly involved in clustering the data. An approach is proposed in the model-based clustering…

Methodology · Statistics 2013-07-18 Jean-Patrick Baudry, Margarida Cardoso, Gilles Celeux, Maria José Amorim, Ana Sousa Ferreira

Variable selection for clustering with Gaussian mixture models: state of the art

Mixture models have become widely used in clustering, given the probabilistic framework on which they are based; however, for modern databases that are characterized by their large size, these models behave disappointingly in setting out the…

Machine Learning · Statistics 2017-02-01 Abdelghafour Talibi, Boujemâa Achchab, Rafik Lasri

Advances in Bayesian random partition models: A comprehensive review

Clustering is a crucial task in various domains of knowledge, including medicine, epidemiology, genomics, environmental science, economics, and visual sciences, among others. Methodologies for inferring the number of clusters have often…

Methodology · Statistics 2025-05-26 Clara Grazian

Informed Random Partition Models with Temporal Dependence

Model-based clustering is a powerful tool that is often used to discover hidden structure in data by grouping observational units that exhibit similar response values. Recently, clustering methods have been developed that permit…

Methodology · Statistics 2025-06-24 Sally Paganin, Garritt L. Page, Fernando Andrés Quintana

Fully Bayesian Estimation under Dependent and Informative Cluster Sampling

Survey data are often collected under multistage sampling designs where units are binned to clusters that are sampled in a first stage. The unit-indexed population variables of interest are typically dependent within cluster. We propose a…

Methodology · Statistics 2021-08-26 Luis G. Leon-Novelo, Terrance D. Savitsky

Sample Size Dependent Species Models

Motivated by the fundamental problem of measuring species diversity, this paper introduces the concept of a cluster structure to define an exchangeable cluster probability function that governs the joint distribution of a random count and…

Methodology · Statistics 2014-10-14 Mingyuan Zhou, Stephen G Walker

Introducing Feature-Based Trajectory Clustering, a clustering algorithm for longitudinal data

We present a new algorithm for clustering longitudinal data. Data of this type can be conceptualized as consisting of individuals and, for each such individual, observations of a time-dependent variable made at various times. Generically,…

Machine Learning · Computer Science 2026-03-17 Marie-Pierre Sylvestre, Laurence Boulanger

Bayesian Level Set Clustering

Classically, Bayesian clustering interprets each component of a mixture model as a cluster. The inferred clustering posterior is highly sensitive to any inaccuracies in the kernel within each component. As this kernel is made more flexible,…

Methodology · Statistics 2025-12-12 David Buch, Miheer Dewaskar, David B. Dunson

Identifiability of Nonparametric Mixture Models and Bayes Optimal Clustering

Motivated by problems in data clustering, we establish general conditions under which families of nonparametric mixture models are identifiable, by introducing a novel framework involving clustering overfitted parametric (i.e.…

Statistics Theory · Mathematics 2020-02-19 Bryon Aragam, Chen Dan, Eric P. Xing, Pradeep Ravikumar

Latent Modularity in Multi-View Data

In this article, we consider the problem of clustering multi-view data, that is, information associated to individuals that form heterogeneous data sources (the views). We adopt a Bayesian model and in the prior structure we assume that…

Methodology · Statistics 2025-11-04 Andrea Cremaschi, Maria De Iorio, Garritt Page, Ajay Jasra

Finite mixture model of conditional dependencies modes to cluster categorical data

We propose a parsimonious extension of the classical latent class model to cluster categorical data by relaxing the class conditional independence assumption. Under this new mixture model, named Conditional Modes Model, variables are…

Methodology · Statistics 2014-02-21 Matthieu Marbac, Christophe Biernacki, Vincent Vandewalle

Why the Rich Get Richer? On the Balancedness of Random Partition Models

Random partition models are widely used in Bayesian methods for various clustering tasks, such as mixture models, topic models, and community detection problems. While the number of clusters induced by random partition models has been…

Machine Learning · Statistics 2022-06-22 Changwoo J. Lee, Huiyan Sang