Related papers: AutoClassWeb: a simple web interface for Bayesian …

BioKlustering: a web app for semi-supervised learning of maximally imbalanced genomic data

Summary: Accurate phenotype prediction from genomic sequences is a highly coveted task in biological and medical research. While machine-learning holds the key to accurate prediction in a variety of fields, the complexity of biological data…

Genomics · Quantitative Biology 2024-12-17 Samuel Ozminkowski, Yuke Wu, Hailey Bruzzone, Liule Yang, Zhiwen Xu, Luke Selberg, Chunrong Huang, Helena Jaramillo-Mesa, Claudia Solis-Lemus

Bayesian Consensus Clustering

The task of clustering a set of objects based on multiple sources of data arises in several modern applications. We propose an integrative statistical model that permits a separate clustering of the objects for each data source. These…

Machine Learning · Statistics 2015-12-01 Eric F. Lock, David B. Dunson

CLASSify: A Web-Based Tool for Machine Learning

Machine learning classification problems are widespread in bioinformatics, but the technical knowledge required to perform model training, optimization, and inference can prevent researchers from utilizing this technology. This article…

Machine Learning · Computer Science 2023-10-06 Aaron D. Mullen, Samuel E. Armstrong, Jeff Talbert, V. K. Cody Bumgardner

Outcome-guided Bayesian Clustering for Disease Subtype Discovery Using High-dimensional Transcriptomic Data

The discovery of disease subtypes is an essential step for developing precision medicine, and disease subtyping via omics data has become a popular approach. While promising, subtypes obtained from conventional approaches may not be…

Applications · Statistics 2023-09-28 Lingsong Meng, Zhiguang Huo

Bayesian Bi-clustering Methods with Applications in Computational Biology

Bi-clustering is a useful approach in analyzing biological data when observations come from heterogeneous groups and have a large number of features. We outline a general Bayesian approach in tackling bi-clustering problems in moderate to…

Applications · Statistics 2021-02-11 Han Yan, Jiexing Wu, Yang Li, Jun S. Liu

Bayesian outcome-guided multi-view mixture models with applications in molecular precision medicine

Clustering is commonly performed as an initial analysis step for uncovering structure in 'omics datasets, e.g. to discover molecular subtypes of disease. The high-throughput, high-dimensional nature of these datasets means that they provide…

Methodology · Statistics 2023-03-02 Paul D. W. Kirk, Filippo Pagani, Sylvia Richardson

Automatic Clustering for Unsupervised Risk Diagnosis of Vehicle Driving for Smart Road

Early risk diagnosis and driving anomaly detection from vehicle stream are of great benefits in a range of advanced solutions towards Smart Road and crash prevention, although there are intrinsic challenges, especially lack of ground truth,…

Machine Learning · Computer Science 2024-10-01 Xiupeng Shi, Yiik Diew Wong, Chen Chai, Michael Zhi-Feng Li, Tianyi Chen, Zeng Zeng

A Privacy-Aware Bayesian Approach for Combining Classifier and Cluster Ensembles

This paper introduces a privacy-aware Bayesian approach that combines ensembles of classifiers and clusterers to perform semi-supervised and transductive learning. We consider scenarios where instances and their classification/clustering…

Machine Learning · Computer Science 2012-04-23 Ayan Acharya, Eduardo R. Hruschka, Joydeep Ghosh

CLAMS: A System for Zero-Shot Model Selection for Clustering

We propose an AutoML system that enables model selection on clustering problems by leveraging optimal transport-based dataset similarity. Our objective is to establish a comprehensive AutoML pipeline for clustering problems and provide…

Machine Learning · Computer Science 2024-07-17 Prabhant Singh, Pieter Gijsbers, Murat Onur Yildirim, Elif Ceren Gok, Joaquin Vanschoren

EXCLUVIS: A MATLAB GUI Software for Comparative Study of Clustering and Visualization of Gene Expression Data

Clustering is a popular data mining technique that aims to partition an input space into multiple homogeneous regions. There exist several clustering algorithms in the literature. The performance of a clustering algorithm depends on its…

Human-Computer Interaction · Computer Science 2020-08-20 Sudip Poddar, Anirban Mukhopadhyay

BayesCPclust: A Bayesian Approach for Clustering Constant-Wise Change-Point Data

Change-point models deal with ordered data sequences. Their primary goal is to infer the locations where an aspect of the data sequence changes. In this paper, we propose and implement a nonparametric Bayesian model for clustering…

Methodology · Statistics 2025-02-12 Ana Carolina da Cruz, Camila P. E. de Souza

Bayesian Clustering Prior with Overlapping Indices for Effective Use of Multisource External Data

The use of external data in clinical trials offers numerous advantages, such as reducing the number of patients, increasing study power, and shortening trial durations. In Bayesian inference, information in external data can be transferred…

Methodology · Statistics 2025-09-17 Xuetao Lu, J. Jack Lee

Bayesian Level Set Clustering

Classically, Bayesian clustering interprets each component of a mixture model as a cluster. The inferred clustering posterior is highly sensitive to any inaccuracies in the kernel within each component. As this kernel is made more flexible,…

Methodology · Statistics 2025-12-12 David Buch, Miheer Dewaskar, David B. Dunson

Adaptive Bayesian Variable Clustering via Structural Learning of Breast Cancer Data

Clustering of proteins is of interest in cancer cell biology. This article proposes a hierarchical Bayesian model for protein (variable) clustering hinging on correlation structure. Starting from a multivariate normal likelihood, we enforce…

Computation · Statistics 2022-02-09 Riddhi Pratim Ghosh, Arnab Kumar Maity, Mohsen Pourahmadi, Bani K. Mallick

Auto-weighted Multi-view Clustering for Large-scale Data

Multi-view clustering has gained broad attention owing to its capacity to exploit complementary information across multiple data views. Although existing methods demonstrate delightful clustering performance, most of them are of high time…

Machine Learning · Computer Science 2023-03-06 Xinhang Wan, Xinwang Liu, Jiyuan Liu, Siwei Wang, Yi Wen, Weixuan Liang, En Zhu, Zhe Liu, Lu Zhou

Probabilistic Combination of Classifier and Cluster Ensembles for Non-transductive Learning

Unsupervised models can provide supplementary soft constraints to help classify new target data under the assumption that similar objects in the target set are more likely to share the same class label. Such models can also help detect…

Machine Learning · Computer Science 2015-03-13 Ayan Acharya, Eduardo R. Hruschka, Joydeep Ghosh, Badrul Sarwar, Jean-David Ruvini

A Bayesian Semiparametric Factor Analysis Model for Subtype Identification

Disease subtype identification (clustering) is an important problem in biomedical research. Gene expression profiles are commonly utilized to infer disease subtypes, which often lead to biologically meaningful insights into disease. Despite…

Methodology · Statistics 2016-09-27 Jiehuan Sun, Joshua L. Warren, Hongyu Zhao

Improved Acyclicity Reasoning for Bayesian Network Structure Learning with Constraint Programming

Bayesian networks are probabilistic graphical models with a wide range of application areas including gene regulatory networks inference, risk analysis and image processing. Learning the structure of a Bayesian network (BNSL) from discrete…

Artificial Intelligence · Computer Science 2021-06-24 Fulya Trösser, Simon de Givry, George Katsirelos

Bayesian Cluster Enumeration Criterion for Unsupervised Learning

We derive a new Bayesian Information Criterion (BIC) by formulating the problem of estimating the number of clusters in an observed data set as maximization of the posterior probability of the candidate models. Given that some mild…

Statistics Theory · Mathematics 2018-08-28 Freweyni K. Teklehaymanot, Michael Muma, Abdelhak M. Zoubir

A shortest-path based clustering algorithm for joint human-machine analysis of complex datasets

Clustering is a technique for the analysis of datasets obtained by empirical studies in several disciplines with a major application for biomedical research. Essentially, clustering algorithms are executed by machines aiming at finding…

Quantitative Methods · Quantitative Biology 2024-09-30 Diego Ulisse Pizzagalli, Santiago Fernandez Gonzalez, Rolf Krause