Related papers: Predictor-Informed Bayesian Nonparametric Clusteri…
The use of high-dimensional data for targeted therapeutic interventions requires new ways to characterize the heterogeneity observed across subgroups of a specific population. In particular, models for partially exchangeable data are needed…
The Bayesian approach to inference stands out for naturally allowing borrowing information across heterogeneous populations, with different samples possibly sharing the same distribution. A popular Bayesian nonparametric model for…
We present a Bayesian approach to predict the clustering of opinions for a system of interacting agents from partial observations. The Bayesian formulation overcomes the unobservability of the system and quantifies the uncertainty in the…
In the framework of model-based clustering, a model, called multi-partitions clustering, allowing several latent class variables has been proposed. This model assumes that the distribution of the observed data can be factorized into several…
Linear mixed models are widely used for analyzing hierarchically structured data involving missingness and unbalanced study designs. We consider a Bayesian clustering method that combines linear mixed models and predictive projections. For…
A model-based approach is developed for clustering categorical data with no natural ordering. The proposed method exploits the Hamming distance to define a family of probability mass functions to model the data. The elements of this family…
We propose a novel method for multiple clustering that assumes a co-clustering structure (partitions in both rows and columns of the data matrix) in each view. The new method is applicable to high-dimensional data. It is based on a…
We propose the Plaid Atoms Model (PAM), a novel Bayesian nonparametric model for grouped data. Founded on an idea of `atom skipping', PAM is part of a well-established category of models that generate dependent random distributions and…
We consider the problem of clustering nested or hierarchical data, where observations are grouped and there are both group-level and observation-level variables. In our motivating OneK1K dataset, observations consist of single-cell…
Bayesian model-based clustering is a widely applied procedure for discovering groups of related observations in a dataset. These approaches use Bayesian mixture models, estimated with MCMC, which provide posterior samples of the model…
In this paper, we propose a general framework for combining evidence of varying quality to estimate underlying binary latent variables in the presence of restrictions imposed to respect the scientific context. The resulting algorithms…
In this work, we propose an original method for aggregating multiple clustering coming from different sources of information. Each partition is encoded by a co-membership matrix between observations. Our approach uses a mixture of…
We introduce a random partition model for Bayesian nonparametric regression. The model is based on infinitely-many disjoint regions of the range of a latent covariate-dependent Gaussian process. Given a realization of the process, the…
Change-point models deal with ordered data sequences. Their primary goal is to infer the locations where an aspect of the data sequence changes. In this paper, we propose and implement a nonparametric Bayesian model for clustering…
Clustering is a crucial task in various domains of knowledge, including medicine, epidemiology, genomics, environmental science, economics, and visual sciences, among others. Methodologies for inferring the number of clusters have often…
We introduce a Bayesian nonparametric inference approach for aggregate adverse event (AE) monitoring across studies. The proposed model seamlessly integrates external data from historical trials to define a relevant background rate and…
Functional concurrent, or varying-coefficient, regression models are commonly used in biomedical and clinical settings to investigate how the relation between an outcome and observed covariate varies as a function of another covariate. In…
A model based clustering procedure for data of mixed type, clustMD, is developed using a latent variable model. It is proposed that a latent variable, following a mixture of Gaussian distributions, generates the observed data of mixed type.…
Motivated by the problem of accurately predicting gap times between successive blood donations, we present here a general class of Bayesian nonparametric models for clustering. These models allow for prediction of new recurrences,…
Semi-supervised clustering is the task of clustering data points into clusters where only a fraction of the points are labelled. The true number of clusters in the data is often unknown and most models require this parameter as an input.…