Related papers: Supervised clustering of high dimensional data usi…

Multinomial Cluster-Weighted Models for High-Dimensional Data

Modeling of high-dimensional data is very important to categorize different classes. We develop a new mixture model called Multinomial cluster-weighted model (MCWM). We derive the identifiability of a general class of MCWM. We estimate the…

Methodology · Statistics 2022-08-25 Kehinde Olobatuyi, Oludare Ariyo

Simple and Scalable Algorithms for Cluster-Aware Precision Medicine

AI-enabled precision medicine promises a transformational improvement in healthcare outcomes by enabling data-driven personalized diagnosis, prognosis, and treatment. However, the well-known "curse of dimensionality" and the clustered…

Machine Learning · Computer Science 2023-05-19 Amanda M. Buch, Conor Liston, Logan Grosenick

Penalized model-based clustering with cluster-specific diagonal covariance matrices and grouped variables

Clustering analysis is one of the most widely used statistical tools in many emerging areas such as microarray data analysis. For microarray and other high-dimensional data, the presence of many noise variables may mask underlying…

Machine Learning · Statistics 2008-03-26 Benhuai Xie, Wei Pan, Xiaotong Shen

Dissecting Microbial Community Structure and Heterogeneity via Multivariate Covariate-Adjusted Clustering

In microbiome studies, it is often of great interest to identify clusters or partitions of microbiome profiles within a study population and to characterize the distinctive attributes of each resulting microbial community. While raw counts…

Methodology · Statistics 2025-08-18 Zhongmao Liu, Xiaohui Yin, Yanjiao Zhou, Gen Li, Kun Chen

A mixture model for subtype identification in the context of disease progression modeling

The progression of chronic diseases often follows highly variable trajectories, and the underlying factors remain poorly understood. Standard mixed-effects models typically represent inter-patient differences as random deviations around a…

Methodology · Statistics 2026-03-05 Sofia Kaisaridi, Juliette Ortholand, Caglayan Tuna, Hugues Chabriat, Sophie Tezenas du Montcel

Mixtures of Common Skew-t Factor Analyzers

A mixture of common skew-t factor analyzers model is introduced for model-based clustering of high-dimensional data. By assuming common component factor loadings, this model allows clustering to be performed in the presence of a large…

Methodology · Statistics 2014-05-05 Paula M. Murray, Paul D. McNicholas, Ryan P. Browne

Supervised Convex Clustering

Clustering has long been a popular unsupervised learning approach to identify groups of similar objects and discover patterns from unlabeled data in many applications. Yet, coming up with meaningful interpretations of the estimated clusters…

Methodology · Statistics 2020-05-26 Minjie Wang, Tianyi Yao, Genevera I. Allen

A Bayesian Finite Mixture Model Approach for Mixed-type Data Clustering and Variable Selection with Censored Biomarkers

Clustering mixed-type data remains a major challenge in biomedical research to uncover clinically meaningful subgroups within heterogeneous patient populations. Most existing clustering methods impose restrictive assumptions like local…

Applications · Statistics 2026-04-23 Yueting Wang, Shu Wang, Jonathan G. Yabes, Chung-Chou H. Chang

Conjugate Mixture Models for Clustering Multimodal Data

The problem of multimodal clustering arises whenever the data are gathered with several physically different sensors. Observations from different modalities are not necessarily aligned in the sense that there is no obvious way to associate…

Machine Learning · Statistics 2020-12-10 Vasil Khalidov, Florence Forbes, Radu Horaud

Mixed data Deep Gaussian Mixture Model: A clustering model for mixed datasets

Clustering mixed data presents numerous challenges inherent to the very heterogeneous nature of the variables. A clustering algorithm should be able, despite this heterogeneity, to extract discriminant pieces of information from the…

Machine Learning · Computer Science 2022-05-10 Robin Fuchs, Denys Pommeret, Cinzia Viroli

Analyzing covariate clustering effects in healthcare cost subgroups: insights and applications for prediction

Healthcare cost prediction is a challenging task due to the high-dimensionality and high correlation among covariates. Additionally, the skewed, heavy-tailed, and often multi-modal nature of cost data can complicate matters further due to…

Methodology · Statistics 2023-03-13 Zhengxiao Li, Yifan Huang, Yang Cao

Model-based clustering and segmentation of time series with changes in regime

Mixture model-based clustering, usually applied to multidimensional data, has become a popular approach in many data analysis problems, both for its good statistical properties and for the simplicity of implementation of the…

Methodology · Statistics 2013-12-30 Allou Samé, Faicel Chamroukhi, Gérard Govaert, Patrice Aknin

A Multivariate Poisson-Log Normal Mixture Model for Clustering Transcriptome Sequencing Data

High-dimensional data of discrete and skewed nature is commonly encountered in high-throughput sequencing studies. Analyzing the network itself or the interplay between genes in this type of data continues to present many challenges. As…

Methodology · Statistics 2017-12-01 Anjali Silva, Steven J. Rothstein, Paul D. McNicholas, Sanjeena Subedi

Subspace clustering of high-dimensional data: a predictive approach

In several application domains, high-dimensional observations are collected and then analysed in search for naturally occurring data clusters which might provide further insights about the nature of the problem. In this paper we describe a…

Machine Learning · Statistics 2012-03-07 Brian McWilliams, Giovanni Montana

Federated unsupervised random forest for privacy-preserving patient stratification

In the realm of precision medicine, effective patient stratification and disease subtyping demand innovative methodologies tailored for multi-omics data. Clustering techniques applied to multi-omics data have become instrumental in…

Machine Learning · Computer Science 2024-01-30 Bastian Pfeifer, Christel Sirocchi, Marcus D. Bloice, Markus Kreuzthaler, Martin Urschler

Model-based clustering via skewed matrix-variate cluster-weighted models

Cluster-weighted models (CWMs) extend finite mixtures of regressions (FMRs) in order to allow the distribution of covariates to contribute to the clustering process. In a matrix-variate framework, the matrix-variate normal CWM has been…

Methodology · Statistics 2021-12-01 Michael P. B. Gallaugher, Salvatore D. Tomarchio, Paul D. McNicholas, Antonio Punzo

A Hybrid Mixture Approach for Clustering and Characterizing Cancer Data

Model-based clustering is widely used for identifying and distinguishing types of diseases. However, modern biomedical data coming with high dimensions make it challenging to perform the model estimation in traditional cluster analysis. The…

Methodology · Statistics 2025-07-22 Kazeem Kareem, Fan Dai

SiamMM: A Mixture Model Perspective on Deep Unsupervised Learning

Recent studies have demonstrated the effectiveness of clustering-based approaches for self-supervised and unsupervised learning. However, the application of clustering is often heuristic, and the optimal methodology remains unclear. In this…

Machine Learning · Computer Science 2025-11-10 Xiaodong Wang, Jing Huang, Kevin J Liang

Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions

We propose a novel method for multiple clustering that assumes a co-clustering structure (partitions in both rows and columns of the data matrix) in each view. The new method is applicable to high-dimensional data. It is based on a…

Machine Learning · Statistics 2019-07-03 Tomoki Tokuda, Junichiro Yoshimoto, Yu Shimizu, Shigeru Toki, Go Okada, Masahiro Takamura, Tetsuya Yamamoto, Shinpei Yoshimura, Yasumasa Okamoto, Shigeto Yamawaki, Kenji Doya

Clustering, Classification, Discriminant Analysis, and Dimension Reduction via Generalized Hyperbolic Mixtures

A method for dimension reduction with clustering, classification, or discriminant analysis is introduced. This mixture model-based approach is based on fitting generalized hyperbolic mixtures on a reduced subspace within the paradigm of…

Methodology · Statistics 2017-10-09 Katherine Morris, Paul D. McNicholas