Related papers: Supervised clustering of high dimensional data usi…

200 papers

Modeling of high-dimensional data is very important to categorize different classes. We develop a new mixture model called Multinomial cluster-weighted model (MCWM). We derive the identifiability of a general class of MCWM. We estimate the…

Methodology · Statistics 2022-08-25 Kehinde Olobatuyi, Oludare Ariyo

AI-enabled precision medicine promises a transformational improvement in healthcare outcomes by enabling data-driven personalized diagnosis, prognosis, and treatment. However, the well-known "curse of dimensionality" and the clustered…

Machine Learning · Computer Science 2023-05-19 Amanda M. Buch, Conor Liston, Logan Grosenick

Clustering analysis is one of the most widely used statistical tools in many emerging areas such as microarray data analysis. For microarray and other high-dimensional data, the presence of many noise variables may mask underlying…

Machine Learning · Statistics 2008-03-26 Benhuai Xie, Wei Pan, Xiaotong Shen

In microbiome studies, it is often of great interest to identify clusters or partitions of microbiome profiles within a study population and to characterize the distinctive attributes of each resulting microbial community. While raw counts…

Methodology · Statistics 2025-08-18 Zhongmao Liu, Xiaohui Yin, Yanjiao Zhou, Gen Li, Kun Chen

The progression of chronic diseases often follows highly variable trajectories, and the underlying factors remain poorly understood. Standard mixed-effects models typically represent inter-patient differences as random deviations around a…

A mixture of common skew-t factor analyzers model is introduced for model-based clustering of high-dimensional data. By assuming common component factor loadings, this model allows clustering to be performed in the presence of a large…

Methodology · Statistics 2014-05-05 Paula M. Murray, Paul D. McNicholas, Ryan P. Browne

Clustering has long been a popular unsupervised learning approach to identify groups of similar objects and discover patterns from unlabeled data in many applications. Yet, coming up with meaningful interpretations of the estimated clusters…

Methodology · Statistics 2020-05-26 Minjie Wang, Tianyi Yao, Genevera I. Allen

Clustering mixed-type data remains a major challenge in biomedical research to uncover clinically meaningful subgroups within heterogeneous patient populations. Most existing clustering methods impose restrictive assumptions like local…

Applications · Statistics 2026-04-23 Yueting Wang, Shu Wang, Jonathan G. Yabes, Chung-Chou H. Chang

The problem of multimodal clustering arises whenever the data are gathered with several physically different sensors. Observations from different modalities are not necessarily aligned in the sense that there is no obvious way to associate…

Machine Learning · Statistics 2020-12-10 Vasil Khalidov, Florence Forbes, Radu Horaud

Clustering mixed data presents numerous challenges inherent to the very heterogeneous nature of the variables. A clustering algorithm should be able, despite this heterogeneity, to extract discriminant pieces of information from the…

Machine Learning · Computer Science 2022-05-10 Robin Fuchs, Denys Pommeret, Cinzia Viroli

Healthcare cost prediction is a challenging task due to the high-dimensionality and high correlation among covariates. Additionally, the skewed, heavy-tailed, and often multi-modal nature of cost data can complicate matters further due to…

Methodology · Statistics 2023-03-13 Zhengxiao Li, Yifan Huang, Yang Cao

Mixture model-based clustering, usually applied to multidimensional data, has become a popular approach in many data analysis problems, both for its good statistical properties and for the simplicity of implementation of the…

Methodology · Statistics 2013-12-30 Allou Samé, Faicel Chamroukhi, Gérard Govaert, Patrice Aknin

High-dimensional data of discrete and skewed nature is commonly encountered in high-throughput sequencing studies. Analyzing the network itself or the interplay between genes in this type of data continues to present many challenges. As…

Methodology · Statistics 2017-12-01 Anjali Silva, Steven J. Rothstein, Paul D. McNicholas, Sanjeena Subedi

In several application domains, high-dimensional observations are collected and then analysed in search for naturally occurring data clusters which might provide further insights about the nature of the problem. In this paper we describe a…

Machine Learning · Statistics 2012-03-07 Brian McWilliams, Giovanni Montana

In the realm of precision medicine, effective patient stratification and disease subtyping demand innovative methodologies tailored for multi-omics data. Clustering techniques applied to multi-omics data have become instrumental in…

Machine Learning · Computer Science 2024-01-30 Bastian Pfeifer, Christel Sirocchi, Marcus D. Bloice, Markus Kreuzthaler, Martin Urschler

Cluster-weighted models (CWMs) extend finite mixtures of regressions (FMRs) in order to allow the distribution of covariates to contribute to the clustering process. In a matrix-variate framework, the matrix-variate normal CWM has been…

Model-based clustering is widely used for identifying and distinguishing types of diseases. However, modern biomedical data coming with high dimensions make it challenging to perform the model estimation in traditional cluster analysis. The…

Methodology · Statistics 2025-07-22 Kazeem Kareem, Fan Dai

Recent studies have demonstrated the effectiveness of clustering-based approaches for self-supervised and unsupervised learning. However, the application of clustering is often heuristic, and the optimal methodology remains unclear. In this…

Machine Learning · Computer Science 2025-11-10 Xiaodong Wang, Jing Huang, Kevin J Liang

We propose a novel method for multiple clustering that assumes a co-clustering structure (partitions in both rows and columns of the data matrix) in each view. The new method is applicable to high-dimensional data. It is based on a…

A method for dimension reduction with clustering, classification, or discriminant analysis is introduced. This mixture model-based approach is based on fitting generalized hyperbolic mixtures on a reduced subspace within the paradigm of…

Methodology · Statistics 2017-10-09 Katherine Morris, Paul D. McNicholas