Related papers: Bayesian Consensus Clustering
Classically, Bayesian clustering interprets each component of a mixture model as a cluster. The inferred clustering posterior is highly sensitive to any inaccuracies in the kernel within each component. As this kernel is made more flexible,…
Bi-clustering is a useful approach in analyzing biological data when observations come from heterogeneous groups and have a large number of features. We outline a general Bayesian approach in tackling bi-clustering problems in moderate to…
Clustering of proteins is of interest in cancer cell biology. This article proposes a hierarchical Bayesian model for protein (variable) clustering hinging on correlation structure. Starting from a multivariate normal likelihood, we enforce…
We present a new model-based integrative method for clustering objects given both vectorial data, which describes the feature of each object, and network data, which indicates the similarity of connected objects. The proposed general model…
Finding patient subgroups with similar characteristics is crucial for personalized decision-making in various disciplines such as healthcare and policy evaluation. While most existing approaches rely on unsupervised clustering methods,…
Clustering is a crucial task in various domains of knowledge, including medicine, epidemiology, genomics, environmental science, economics, and visual sciences, among others. Methodologies for inferring the number of clusters have often…
Cancer is a number of related yet highly heterogeneous diseases. Correct identification of cancer subtypes is critical for clinical decisions. The advance in sequencing technologies has made it possible to study cancer based on abundant…
In cluster analysis interest lies in probabilistically capturing partitions of individuals, items or observations into groups, such that those belonging to the same group share similar attributes or relational profiles. Bayesian posterior…
Clustering is commonly performed as an initial analysis step for uncovering structure in 'omics datasets, e.g. to discover molecular subtypes of disease. The high-throughput, high-dimensional nature of these datasets means that they provide…
We consider comparisons of statistical learning algorithms using multiple data sets, via leave-one-in cross-study validation: each of the algorithms is trained on one data set; the resulting model is then validated on each remaining data…
The community structure of complex networks reveals both their organization and hidden relationships among their constituents. Most community detection methods currently available are not deterministic, and their results typically depend on…
Mixture model-based frameworks are very popular for statistical inference in clustering. While convenient for producing probabilistic estimates of cluster assignments and uncertainty, they are prone to misspecification, which can lead to…
Bayesian clustering typically relies on mixture models, with each component interpreted as a different cluster. After defining a prior for the component parameters and weights, Markov chain Monte Carlo (MCMC) algorithms are commonly used to…
Network models provide a powerful framework for analysing single-cell count data, facilitating the characterisation of cellular identities, disease mechanisms, and developmental trajectories. However, uncertainty modeling in unsupervised…
We present an approach to model-based hierarchical clustering by formulating an objective function based on a Bayesian analysis. This model organizes the data into a cluster hierarchy while specifying a complex feature-set partitioning that…
Rapid technological advances have allowed for molecular profiling across multiple omics domains from a single sample for clinical decision making in many diseases, especially cancer. As tumor development and progression are dynamic…
We present a novel framework for concomitant dimension reduction and clustering. This framework is based on a novel class of Bayesian clustering factor models. These models assume a factor model structure where the vectors of common factors…
The clustering algorithms that view each object data as a single sample drawn from a certain distribution, Gaussian distribution, for example, has been a hot topic for decades. Many clustering algorithms: such as k-means and spectral…
Various methods have been developed to combine inference across multiple sets of results for unsupervised clustering, within the ensemble clustering literature. The approach of reporting results from one `best' model out of several…
In many modern applications, there is interest in analyzing enormous data sets that cannot be easily moved across computers or loaded into memory on a single computer. In such settings, it is very common to be interested in clustering.…