Related papers: Bayesian Variable Selection for Probit Mixed Model…
Finite mixture model is an important branch of clustering methods and can be applied on data sets with mixed types of variables. However, challenges exist in its applications. First, it typically relies on the EM algorithm which could be…
Bayesian variable selection is a powerful tool for data analysis, as it offers a principled method for variable selection that accounts for prior information and uncertainty. However, wider adoption of Bayesian variable selection has been…
We propose a novel Bayesian model selection technique on linear mixed-effects models to compare multiple treatments with a control. A fully Bayesian approach is implemented to estimate the marginal inclusion probabilities that provide a…
In large-scale genomic applications vast numbers of molecular features are scanned in order to find a small number of candidates which are linked to a particular disease or phenotype. This is a variable selection problem in the "large p,…
Variable selection has played a critical role in modern statistical learning and scientific discoveries. Numerous regularization and Bayesian variable selection methods have been developed in the past two decades for variable selection, but…
This paper presents a new modeling strategy for joint unsupervised analysis of multiple high-throughput biological studies. As in Multi-study Factor Analysis, our goals are to identify both common factors shared across studies and…
Machine Learning methods have of late made significant efforts to solving multidisciplinary problems in the field of cancer classification using microarray gene expression data. Feature subset selection methods can play an important role in…
Bayesian variable selection is a powerful tool for data analysis, as it offers a principled method for variable selection that accounts for prior information and uncertainty. However, wider adoption of Bayesian variable selection has been…
We present an applied study in cancer genomics for integrating data and inferences from laboratory experiments on cancer cell lines with observational data obtained from human breast cancer studies. The biological focus is on improving…
Significant advances in biotechnology have allowed for simultaneous measurement of molecular data points across multiple genomic and transcriptomic levels from a single tumor/cancer sample. This has motivated systematic approaches to…
Understanding how stochastic gene expression is regulated in biological systems using snapshots of single-cell transcripts requires state-of-the-art methods of computational analysis and statistical inference. A Bayesian approach to…
Numerous studies have shown that microbial metabolites, which represent the products of bacteria in the human gut, play a key role in shaping cancer risk and response to treatment. However, metabolite data typically contain a large…
Subset selection is a valuable tool for interpretable learning, scientific discovery, and data compression. However, classical subset selection is often avoided due to selection instability, lack of regularization, and difficulties with…
A substantial focus of research in molecular biology are gene regulatory networks: the set of transcription factors and target genes which control the involvement of different biological processes in living cells. Previous statistical…
Clustering is commonly performed as an initial analysis step for uncovering structure in 'omics datasets, e.g. to discover molecular subtypes of disease. The high-throughput, high-dimensional nature of these datasets means that they provide…
The aggregation of microarray datasets originating from different studies is still a difficult open problem. Currently, best results are generally obtained by the so-called meta-analysis approach, which aggregates results from individual…
Gene and protein networks are very important to model complex large-scale systems in molecular biology. Inferring or reverseengineering such networks can be defined as the process of identifying gene/protein interactions from experimental…
We propose a new methodology for selecting and ranking covariates associated with a variable of interest in a context of high-dimensional data under dependence but few observations. The methodology successively intertwines the clustering of…
In data sets with many predictors, algorithms for identifying a good subset of predictors are often used. Most such algorithms do not account for any relationships between predictors. For example, stepwise regression might select a model…
We present a coherent Bayesian framework for selection of the most likely model from the five genetic models (genotypic, additive, dominant, co-dominant, and recessive) commonly used in genetic association studies. The approach uses a…