Related papers: Model-based Clustering using Automatic Differentia…

Improved Inference of Gaussian Mixture Copula Model for Clustering and Reproducibility Analysis using Automatic Differentiation

Copulas provide a modular parameterization of multivariate distributions that decouples the modeling of marginals from the dependencies between them. Gaussian Mixture Copula Model (GMCM) is a highly flexible copula that can model many kinds…

Methodology · Statistics 2021-09-29 Siva Rajesh Kasa, Vaibhav Rajan

Clustering and Model Selection via Penalized Likelihood for Different-sized Categorical Data Vectors

In this study, we consider unsupervised clustering of categorical vectors that can be of different sizes using a mixture model. We use likelihood maximization to estimate the parameters of the underlying mixture model and a penalization technique to…

Statistics Theory · Mathematics 2017-09-08 Esther Derman, Erwan Le Pennec

Group-wise shrinkage estimation in penalized model-based clustering

Finite Gaussian mixture models provide a powerful and widely employed probabilistic approach for clustering multivariate continuous data. However, the practical usefulness of these models is jeopardized in high-dimensional spaces, where…

Methodology · Statistics 2022-05-13 Alessandro Casa, Andrea Cappozzo, Michael Fop

Penalized model-based clustering with cluster-specific diagonal covariance matrices and grouped variables

Clustering analysis is one of the most widely used statistical tools in many emerging areas such as microarray data analysis. For microarray and other high-dimensional data, the presence of many noise variables may mask underlying…

Machine Learning · Statistics 2008-03-26 Benhuai Xie, Wei Pan, Xiaotong Shen

High-dimensional cluster analysis with the Masked EM Algorithm

Cluster analysis faces two problems in high dimensions: first, the 'curse of dimensionality' that can lead to overfitting and poor generalization performance; and second, the sheer time taken for conventional algorithms to process large…

Quantitative Methods · Quantitative Biology 2013-09-12 Shabnam N. Kadir, Dan F. M. Goodman, Kenneth D. Harris

Automatic Differentiation in Mixture Models

In this article, we discuss two specific classes of models - Gaussian Mixture Copula models and Mixture of Factor Analyzers - and the advantages of doing inference with gradient descent using automatic differentiation. Gaussian mixture…

Computation · Statistics 2018-12-17 Siva Rajesh Kasa, Vaibhav Rajan

Variable selection for model-based clustering using the integrated complete-data likelihood

Variable selection in cluster analysis is important yet challenging. It can be achieved by regularization methods, which realize a trade-off between the clustering accuracy and the number of selected variables by using a lasso-type penalty.…

Methodology · Statistics 2016-12-23 Marbac Matthieu, Sedki Mohammed

Model-based Clustering with Sparse Covariance Matrices

Finite Gaussian mixture models are widely used for model-based clustering of continuous data. Nevertheless, since the number of model parameters scales quadratically with the number of variables, these models can be easily…

Methodology · Statistics 2018-09-25 Michael Fop, Thomas Brendan Murphy, Luca Scrucca

Improved initialisation of model-based clustering using Gaussian hierarchical partitions

Initialisation of the EM algorithm in model-based clustering is often crucial. Various starting points in the parameter space often lead to different local maxima of the likelihood function and so to different clustering partitions. Among…

Methodology · Statistics 2015-07-28 Luca Scrucca, Adrian E. Raftery

Time Series Clustering with an EM algorithm for Mixtures of Linear Gaussian State Space Models

In this paper, we consider the task of clustering a set of individual time series while modeling each cluster, that is, model-based time series clustering. The task requires a parametric model with sufficient flexibility to describe the…

Machine Learning · Computer Science 2023-02-23 Ryohei Umatani, Takashi Imai, Kaoru Kawamoto, Shutaro Kunimasa

Flexible Clustering with a Sparse Mixture of Generalized Hyperbolic Distributions

Robust clustering of high-dimensional data is an important topic because clusters in real datasets are often heavy-tailed and/or asymmetric. Traditional approaches to model-based clustering often fail for high dimensional data, e.g., due to…

Methodology · Statistics 2024-06-07 Alexa A. Sochaniwsky, Michael P. B. Gallaugher, Yang Tang, Paul D. McNicholas

Multinomial Cluster-Weighted Models for High-Dimensional Data

Modeling of high-dimensional data is very important to categorize different classes. We develop a new mixture model called Multinomial cluster-weighted model (MCWM). We derive the identifiability of a general class of MCWM. We estimate the…

Methodology · Statistics 2022-08-25 Kehinde Olobatuyi, Oludare Ariyo

On perfect clustering of high dimension, low sample size data

Popular clustering algorithms based on usual distance functions (e.g., Euclidean distance) often suffer in high dimension, low sample size (HDLSS) situations, where concentration of pairwise distances has adverse effects on their…

Methodology · Statistics 2019-05-03 Soham Sarkar, Anil K. Ghosh

A Model-Based Clustering Approach for Bounded Data Using Transformation-Based Gaussian Mixture Models

The clustering of bounded data presents unique challenges in statistical analysis due to the constraints imposed on the data values. This paper introduces a novel method for model-based clustering specifically designed for bounded data.…

Methodology · Statistics 2025-05-16 Luca Scrucca

Robust EM algorithm for model-based curve clustering

Model-based clustering approaches concern the paradigm of exploratory data analysis relying on the finite mixture model to automatically find a latent structure governing observed data. They are one of the most popular and successful…

Methodology · Statistics 2014-04-29 Faicel Chamroukhi

Improving Model Choice in Classification: An Approach Based on Clustering of Covariance Matrices

This work introduces a refinement of the Parsimonious Model for fitting a Gaussian Mixture. The improvement is based on the consideration of clusters of the involved covariance matrices according to a criterion, such as sharing Principal…

Methodology · Statistics 2024-04-10 David Rodríguez-Vítores, Carlos Matrán

Cluster-Specific Predictions with Multi-Task Gaussian Processes

A model involving Gaussian processes (GPs) is introduced to simultaneously handle multi-task learning, clustering, and prediction for multiple functional data. This procedure acts as a model-based clustering method for functional data as…

Machine Learning · Computer Science 2023-01-24 Arthur Leroy, Pierre Latouche, Benjamin Guedj, Servane Gey

Deep Conditional Gaussian Mixture Model for Constrained Clustering

Constrained clustering has gained significant attention in the field of machine learning as it can leverage prior information on a growing amount of only partially labeled data. Following recent advances in deep generative models, we…

Machine Learning · Computer Science 2022-02-02 Laura Manduchi, Kieran Chin-Cheong, Holger Michel, Sven Wellmann, Julia E. Vogt

Hard-Clustering with Gaussian Mixture Models

Training the parameters of statistical models to describe a given data set is a central task in the field of data mining and machine learning. A very popular and powerful way of parameter estimation is the method of maximum likelihood…

Machine Learning · Computer Science 2016-03-22 Johannes Blömer, Sascha Brauer, Kathrin Bujna

Joint Representation Learning and Clustering via Gradient-Based Manifold Optimization

Clustering and dimensionality reduction have been crucial topics in machine learning and computer vision. Clustering high-dimensional data has been challenging for a long time due to the curse of dimensionality. For that reason, a more…

Machine Learning · Statistics 2026-04-16 Sida Liu, Yangzi Guo, Mingyuan Wang