Related papers: Neural Clustering Processes
Mixture models and topic models generate each observation from a single cluster, but standard variational posteriors for each observation assign positive probability to all possible clusters. This requires dense storage and runtime costs…
The use of a finite mixture of normal distributions in model-based clustering allows to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and in general either achieved by…
Mixture models extend the toolbox of clustering methods available to the data analyst. They allow for an explicit definition of the cluster shapes and structure within a probabilistic framework and exploit estimation and inference…
We propose a novel end-to-end neural network architecture that, once trained, directly outputs a probabilistic clustering of a batch of input examples in one pass. It estimates a distribution over the number of clusters $k$, and for each $1…
Clustering is a NP-hard problem. Thus, no optimal algorithm exists, heuristics are applied to cluster the data. Heuristics can be very resource-intensive, if not applied properly. For substantially large data sets computational efficiencies…
Any clustering algorithm must synchronously learn to model the clusters and allocate data to those clusters in the absence of labels. Mixture model-based methods model clusters with pre-defined statistical distributions and allocate data to…
Clustering is a widely used technique with a long and rich history in a variety of areas. However, most existing algorithms do not scale well to large datasets, or are missing theoretical guarantees of convergence. This paper introduces a…
This paper presents a neural network-based end-to-end clustering framework. We design a novel strategy to utilize the contrastive criteria for pushing data-forming clusters directly from raw data, in addition to learning a feature embedding…
We propose a Bayesian approach for model-based clustering of multivariate categorical data where variables are allowed to be associated within clusters and the number of clusters is unknown. The approach uses a two-layer mixture of finite…
Bayesian clustering methods have the widely touted advantage of providing a probabilistic characterization of uncertainty in clustering through the posterior distribution. An amazing variety of priors and likelihoods have been proposed for…
We introduce the Neural Collaborative Subspace Clustering, a neural model that discovers clusters of data points drawn from a union of low-dimensional subspaces. In contrast to previous attempts, our model runs without the aid of spectral…
As a kind of basic machine learning method, clustering algorithms group data points into different categories based on their similarity or distribution. We present a clustering algorithm by finding hyper-planes to distinguish the data…
Mixture models are flexible tools in density estimation and classification problems. Bayesian estimation of such models typically relies on sampling from the posterior distribution using Markov chain Monte Carlo. Label switching arises…
The paper tackles the problem of clustering multiple networks, directed or not, that do not share the same set of vertices, into groups of networks with similar topology. A statistical model-based approach based on a finite mixture of…
Mixture model-based frameworks are very popular for statistical inference in clustering. While convenient for producing probabilistic estimates of cluster assignments and uncertainty, they are prone to misspecification, which can lead to…
Patchwork learning arises as a new and challenging data collection paradigm where both samples and features are observed in fragmented subsets. Due to technological limits, measurement expense, or multimodal data integration, such patchwork…
Clustering is widely used in unsupervised learning to find homogeneous groups of observations within a dataset. However, clustering mixed-type data remains a challenge, as few existing approaches are suited for this task. This study…
Probabilistic mixture models have been widely used for different machine learning and pattern recognition tasks such as clustering, dimensionality reduction, and classification. In this paper, we focus on trying to solve the most common…
Statistical methods for reconstructing networks from repeated measurements typically assume that all measurements are generated from the same underlying network structure. This need not be the case, however. People's social networks might…
Dirichlet process mixtures are flexible non-parametric models, particularly suited to density estimation and probabilistic clustering. In this work we study the posterior distribution induced by Dirichlet process mixtures as the sample size…