Related papers: Bayesian Supervised Causal Clustering

200 papers

The task of clustering a set of objects based on multiple sources of data arises in several modern applications. We propose an integrative statistical model that permits a separate clustering of the objects for each data source. These…

Machine Learning · Statistics 2015-12-01 Eric F. Lock, David B. Dunson

Clustering has long been a popular unsupervised learning approach to identify groups of similar objects and discover patterns from unlabeled data in many applications. Yet, coming up with meaningful interpretations of the estimated clusters…

Methodology · Statistics 2020-05-26 Minjie Wang, Tianyi Yao, Genevera I. Allen

Estimating heterogeneous treatment effects is critical in domains such as personalized medicine, resource allocation, and policy evaluation. A central challenge lies in identifying subpopulations that respond differently to interventions,…

Machine Learning · Statistics 2025-09-18 Zilong Wang, Turgay Ayer, Shihao Yang

Clustering is commonly performed as an initial analysis step for uncovering structure in 'omics datasets, e.g. to discover molecular subtypes of disease. The high-throughput, high-dimensional nature of these datasets means that they provide…

Methodology · Statistics 2023-03-02 Paul D. W. Kirk, Filippo Pagani, Sylvia Richardson

Cluster analysis methods are used to identify homogeneous subgroups in a data set. In biomedical applications, one frequently applies cluster analysis in order to identify biologically interesting subgroups. In particular, one may wish to…

Methodology · Statistics 2016-09-23 Sheila Gaynor, Eric Bair

Bi-clustering is a useful approach in analyzing biological data when observations come from heterogeneous groups and have a large number of features. We outline a general Bayesian approach in tackling bi-clustering problems in moderate to…

Applications · Statistics 2021-02-11 Han Yan, Jiexing Wu, Yang Li, Jun S. Liu

Clustering is a crucial task in various domains of knowledge, including medicine, epidemiology, genomics, environmental science, economics, and visual sciences, among others. Methodologies for inferring the number of clusters have often…

Methodology · Statistics 2025-05-26 Clara Grazian

The discovery of disease subtypes is an essential step for developing precision medicine, and disease subtyping via omics data has become a popular approach. While promising, subtypes obtained from conventional approaches may not be…

Applications · Statistics 2023-09-28 Lingsong Meng, Zhiguang Huo

We derive a new Bayesian Information Criterion (BIC) by formulating the problem of estimating the number of clusters in an observed data set as maximization of the posterior probability of the candidate models. Given that some mild…

Statistics Theory · Mathematics 2018-08-28 Freweyni K. Teklehaymanot, Michael Muma, Abdelhak M. Zoubir

The paper presents a novel approach for unsupervised techniques in the field of clustering. A new method is proposed to enhance existing literature models using the proper Bayesian bootstrap to improve results in terms of robustness and…

Machine Learning · Statistics 2024-09-16 Federico Maria Quetti, Silvia Figini, Elena Ballante

Choosing appropriate hyperparameters for unsupervised clustering algorithms in an optimal way depending on the problem under study is a long-standing challenge, which we tackle while adapting clustering algorithms for immune disorder…

Quantitative Methods · Quantitative Biology 2020-09-25 A. Carpio, A. Simón, L. F. Villa

Determining phenotypes of diseases can have considerable benefits for in-hospital patient care and for drug development. The structure of high-dimensional data sets such as electronic health records is often represented through an embedding…

Understanding treatment effect heterogeneity is vital for scientific and policy research. However, identifying and evaluating heterogeneous treatment effects pose significant challenges due to the typically unknown subgroup structure.…

Methodology · Statistics 2024-11-05 Kwangho Kim, Jisu Kim, Larry A. Wasserman, Edward H. Kennedy

We present a novel framework for concomitant dimension reduction and clustering. This framework is based on a novel class of Bayesian clustering factor models. These models assume a factor model structure where the vectors of common factors…

Methodology · Statistics 2025-05-09 Hwasoo Shin, Marco A. R. Ferreira, Allison N. Tegge

We consider an extension of model-based clustering to the semi-supervised case, where some of the data are pre-labeled. We provide a derivation of the Bayesian Information Criterion (BIC) approximation to the Bayes factor in this setting.…

Methodology · Statistics 2016-04-28 Jordan Yoder, Carey E. Priebe

Size-constrained clustering (SCC) refers to the dual problem of using observations to determine latent cluster structure while at the same time assigning observations to the unknown clusters subject to an analyst-defined constraint on…

Applications · Statistics 2017-10-18 Justin D. Silverman, Rachel K. Silverman

Clustering is a well-known unsupervised machine learning approach capable of automatically grouping discrete sets of instances with similar characteristics. Constrained clustering is a semi-supervised extension to this process that can be…

Machine Learning · Computer Science 2023-03-02 Germán González-Almagro, Daniel Peralta, Eli De Poorter, José-Ramón Cano, Salvador García

Clustering mixed-type data remains a major challenge in biomedical research to uncover clinically meaningful subgroups within heterogeneous patient populations. Most existing clustering methods impose restrictive assumptions like local…

Applications · Statistics 2026-04-23 Yueting Wang, Shu Wang, Jonathan G. Yabes, Chung-Chou H. Chang

A key question in causal inference analyses is how to find subgroups with elevated treatment effects. This paper takes a machine learning approach and introduces a generative model, Causal Rule Sets (CRS), for interpretable subgroup…

Artificial Intelligence · Computer Science 2021-05-21 Tong Wang, Cynthia Rudin

Disease subtype identification (clustering) is an important problem in biomedical research. Gene expression profiles are commonly utilized to infer disease subtypes, which often lead to biologically meaningful insights into disease. Despite…

Methodology · Statistics 2016-09-27 Jiehuan Sun, Joshua L. Warren, Hongyu Zhao