Related papers: Robust Differential Abundance Test in Compositiona…
Identifying which taxa in our microbiota are associated with traits of interest is important for advancing science and health. However, the identification is challenging because the measured vector of taxa counts (by amplicon sequencing) is…
In microbiome and genomic studies, the regression of compositional data has been a crucial tool for identifying microbial taxa or genes that are associated with clinical phenotypes. To account for the variation in sequencing depth, the…
Differential abundance analysis is at the core of statistical analysis of microbiome data. The compositional nature of microbiome sequencing data makes false positive control challenging. Here, we show that the compositional effects can be…
Differential abundance analysis is a key component of microbiome studies. Although dozens of methods exist, there is currently no consensus on the preferred methods. While the correctness of results in differential abundance analysis is an…
Compositional data (i.e., data comprising random variables that sum up to a constant) arises in many applications including microbiome studies, chemical ecology, political science, and experimental designs. Yet when compositional data serve…
A popular approach for comparing gene expression levels between (replicated) conditions of RNA sequencing data relies on counting reads that map to features of interest. Within such count-based methods, many flexible and advanced…
Compositional data arise in many areas of research in the natural and biomedical sciences. One prominent example is in the study of the human gut microbiome, where one can measure the relative abundance of many distinct microorganisms in a…
The Dirichlet-multinomial (DM) distribution plays a fundamental role in modern statistical methodology development and application. Recently, the DM distribution and its variants have been used extensively to model multivariate count data…
Few tests exist for the equality of two or more multivariate distributions with compositional data, perhaps due to their constrained sample space. At the moment, only one test has been suggested, which relies upon random…
Scientific studies in the last two decades have established the central role of the microbiome in disease and health. Differential abundance analysis seeks to identify microbial taxa associated with sample groups defined by a factor such as…
Many biological high-throughput data sets, such as targeted amplicon-based and metagenomic sequencing data, are compositional in nature. A common exploratory data analysis task is to infer statistical associations between the…
The restricted polynomially-tilted pairwise interaction (RPPI) distribution gives a flexible model for compositional data. It is particularly well-suited to situations where some of the marginal distributions of the components of a…
In our paper, we focus on robust variable selection in the presence of missing data and measurement error. Missing data and measurement errors can lead to a confusing data distribution. We propose an exponential loss function with a tuning parameter to…
Statistical analysis of compositional data has gained a lot of attention due to its wide range of potential applications. A feature of these data is that they are multivariate vectors that lie in the simplex, that is, the components of each…
Robust causal discovery from observational data under imperfect prior knowledge remains a significant and largely unresolved challenge. Existing methods typically presuppose perfect priors or can only handle specific, pre-identified error…
We investigate one/two-sample mean tests for high-dimensional compositional data when the number of variables is comparable with the sample size, as commonly encountered in microbiome research. Existing methods mainly focus on max-type test…
Experimental design has emerged as a powerful approach for improving the sample efficiency of A/B testing, yet existing designs rely critically on correctly specified models. We study robust sequential experimental design under model…
In observational studies, covariates with substantial missing data are often omitted, despite their strong predictive capabilities. These excluded covariates are generally believed not to simultaneously affect both treatment and outcome,…
To ensure reliable causal conclusions from observational (i.e., non-randomized) studies, researchers routinely conduct sensitivity analysis to assess robustness to hidden bias due to unmeasured confounding. In matched observational studies…
Knowledge distillation (KD) has enabled remarkable progress in model compression and knowledge transfer. However, KD requires a large volume of original data or their representation statistics that are not usually available in practice.…