Related papers: Model-based Clustering using Automatic Differentia…

200 papers

Copulas provide a modular parameterization of multivariate distributions that decouples the modeling of marginals from the dependencies between them. Gaussian Mixture Copula Model (GMCM) is a highly flexible copula that can model many kinds…

Methodology · Statistics 2021-09-29 Siva Rajesh Kasa , Vaibhav Rajan

In this study, we consider unsupervised clustering of categorical vectors that can be of different size using mixture. We use likelihood maximization to estimate the parameters of the underlying mixture model and a penalization technique to…

Statistics Theory · Mathematics 2017-09-08 Esther Derman , Erwan Le Pennec

Finite Gaussian mixture models provide a powerful and widely employed probabilistic approach for clustering multivariate continuous data. However, the practical usefulness of these models is jeopardized in high-dimensional spaces, where…

Methodology · Statistics 2022-05-13 Alessandro Casa , Andrea Cappozzo , Michael Fop

Clustering analysis is one of the most widely used statistical tools in many emerging areas such as microarray data analysis. For microarray and other high-dimensional data, the presence of many noise variables may mask underlying…

Machine Learning · Statistics 2008-03-26 Benhuai Xie , Wei Pan , Xiaotong Shen

Cluster analysis faces two problems in high dimensions: first, the 'curse of dimensionality' that can lead to overfitting and poor generalization performance; and second, the sheer time taken for conventional algorithms to process large…

Quantitative Methods · Quantitative Biology 2013-09-12 Shabnam N. Kadir , Dan F. M. Goodman , Kenneth D. Harris

In this article, we discuss two specific classes of models - Gaussian Mixture Copula models and Mixture of Factor Analyzers - and the advantages of doing inference with gradient descent using automatic differentiation. Gaussian mixture…

Computation · Statistics 2018-12-17 Siva Rajesh Kasa , Vaibhav Rajan

Variable selection in cluster analysis is important yet challenging. It can be achieved by regularization methods, which realize a trade-off between the clustering accuracy and the number of selected variables by using a lasso-type penalty.…

Methodology · Statistics 2016-12-23 Marbac Matthieu , Sedki Mohammed

Finite Gaussian mixture models are widely used for model-based clustering of continuous data. Nevertheless, since the number of model parameters scales quadratically with the number of variables, these models can be easily…

Methodology · Statistics 2018-09-25 Michael Fop , Thomas Brendan Murphy , Luca Scrucca

Initialisation of the EM algorithm in model-based clustering is often crucial. Various starting points in the parameter space often lead to different local maxima of the likelihood function and, so to different clustering partitions. Among…

Methodology · Statistics 2015-07-28 Luca Scrucca , Adrian E. Raftery

In this paper, we consider the task of clustering a set of individual time series while modeling each cluster, that is, model-based time series clustering. The task requires a parametric model with sufficient flexibility to describe the…

Machine Learning · Computer Science 2023-02-23 Ryohei Umatani , Takashi Imai , Kaoru Kawamoto , Shutaro Kunimasa

Robust clustering of high-dimensional data is an important topic because clusters in real datasets are often heavy-tailed and/or asymmetric. Traditional approaches to model-based clustering often fail for high dimensional data, e.g., due to…

Methodology · Statistics 2024-06-07 Alexa A. Sochaniwsky , Michael P. B. Gallaugher , Yang Tang , Paul D. McNicholas

Modeling of high-dimensional data is very important to categorize different classes. We develop a new mixture model called Multinomial cluster-weighted model (MCWM). We derive the identifiability of a general class of MCWM. We estimate the…

Methodology · Statistics 2022-08-25 Kehinde Olobatuyi , Oludare Ariyo

Popular clustering algorithms based on usual distance functions (e.g., Euclidean distance) often suffer in high dimension, low sample size (HDLSS) situations, where concentration of pairwise distances has adverse effects on their…

Methodology · Statistics 2019-05-03 Soham Sarkar , Anil K. Ghosh

The clustering of bounded data presents unique challenges in statistical analysis due to the constraints imposed on the data values. This paper introduces a novel method for model-based clustering specifically designed for bounded data.…

Methodology · Statistics 2025-05-16 Luca Scrucca

Model-based clustering approaches concern the paradigm of exploratory data analysis relying on the finite mixture model to automatically find a latent structure governing observed data. They are one of the most popular and successful…

Methodology · Statistics 2014-04-29 Faicel Chamroukhi

This work introduces a refinement of the Parsimonious Model for fitting a Gaussian Mixture. The improvement is based on the consideration of clusters of the involved covariance matrices according to a criterion, such as sharing Principal…

Methodology · Statistics 2024-04-10 David Rodríguez-Vítores , Carlos Matrán

A model involving Gaussian processes (GPs) is introduced to simultaneously handle multi-task learning, clustering, and prediction for multiple functional data. This procedure acts as a model-based clustering method for functional data as…

Machine Learning · Computer Science 2023-01-24 Arthur Leroy , Pierre Latouche , Benjamin Guedj , Servane Gey

Constrained clustering has gained significant attention in the field of machine learning as it can leverage prior information on a growing amount of only partially labeled data. Following recent advances in deep generative models, we…

Machine Learning · Computer Science 2022-02-02 Laura Manduchi , Kieran Chin-Cheong , Holger Michel , Sven Wellmann , Julia E. Vogt

Training the parameters of statistical models to describe a given data set is a central task in the field of data mining and machine learning. A very popular and powerful way of parameter estimation is the method of maximum likelihood…

Machine Learning · Computer Science 2016-03-22 Johannes Blömer , Sascha Brauer , Kathrin Bujna

Clustering and dimensionality reduction have been crucial topics in machine learning and computer vision. Clustering high-dimensional data has been challenging for a long time due to the curse of dimensionality. For that reason, a more…

Machine Learning · Statistics 2026-04-16 Sida Liu , Yangzi Guo , Mingyuan Wang