Related papers: Using Gaussian Measures for Efficient Constraint B…
We propose a computationally simple framework for clustering functional data based on Gaussian-process-generated random projections. In this approach, each curve is first projected onto a large collection of independent Gaussian process…
Clustering techniques are very attractive for extracting and identifying patterns in datasets. However, their application to very large spatial datasets presents numerous challenges such as high-dimensionality data, heterogeneity, and high…
We propose a Fourier-based approach for optimization of several clustering algorithms. Mathematically, clusters data can be described by a density function represented by the Dirac mixture distribution. The density function can be smoothed…
Constrained clustering has gained significant attention in the field of machine learning as it can leverage prior information on a growing amount of only partially labeled data. Following recent advances in deep generative models, we…
We give an efficient algorithm for robustly clustering of a mixture of two arbitrary Gaussians, a central open problem in the theory of computationally efficient robust estimation, assuming only that the the means of the component Gaussians…
Standard Gaussian Process (GP) regression, a powerful machine learning tool, is computationally expensive when it is applied to large datasets, and potentially inaccurate when data points are sparsely distributed in a high-dimensional…
We propose a simple and efficient clustering method for high-dimensional data with a large number of clusters. Our algorithm achieves high-performance by evaluating distances of datapoints with a subset of the cluster centres. Our…
High-dimensional clustering analysis is a challenging problem in statistics and machine learning, with broad applications such as the analysis of microarray data and RNA-seq data. In this paper, we propose a new clustering procedure called…
With inspiration from Random Forests (RF) in the context of classification, a new clustering ensemble method---Cluster Forests (CF) is proposed. Geometrically, CF randomly probes a high-dimensional data cloud to obtain "good local…
Bayesian clustering typically relies on mixture models, with each component interpreted as a different cluster. After defining a prior for the component parameters and weights, Markov chain Monte Carlo (MCMC) algorithms are commonly used to…
We present a novel framework for concomitant dimension reduction and clustering. This framework is based on a novel class of Bayesian clustering factor models. These models assume a factor model structure where the vectors of common factors…
Cluster analysis which focuses on the grouping and categorization of similar elements is widely used in various fields of research. Inspired by the phenomenon of atomic fission, a novel density-based clustering algorithm is proposed in this…
We consider the problem of clustering data points in high dimensions, i.e. when the number of data points may be much smaller than the number of dimensions. Specifically, we consider a Gaussian mixture model (GMM) with non-spherical…
The clustering of bounded data presents unique challenges in statistical analysis due to the constraints imposed on the data values. This paper introduces a novel method for model-based clustering specifically designed for bounded data.…
We derive an efficient method to perform clustering of nodes in Gaussian graphical models directly from sample data. Nodes are clustered based on the similarity of their network neighborhoods, with edge weights defined by partial…
We propose a novel methodology for feature screening in clustering massive datasets, in which both the number of features and the number of observations can potentially be very large. Taking advantage of a fusion penalization based convex…
Clustering is often used for discovering structure in data. Clustering systems differ in the objective function used to evaluate clustering quality and the control strategy used to search the space of clusterings. Ideally, the search…
Deep multi-view clustering seeks to utilize the abundant information from multiple views to improve clustering performance. However, most of the existing clustering methods often neglect to fully mine multi-view structural information and…
This paper considers clustered multi-task compressive sensing, a hierarchical model that solves multiple compressive sensing tasks by finding clusters of tasks that leverage shared information to mutually improve signal reconstruction. The…
The paper presents the algorithm for clustering a dataset by grouping the optimal, from the point of view of the BIC criterion, number of Gaussian clusters into the optimal, from the point of view of their statistical separability,…