Related papers: Robust Differential Abundance Test in Compositiona…

Testing for differential abundance in compositional counts data, with application to microbiome studies

Identifying which taxa in our microbiota are associated with traits of interest is important for advancing science and health. However, the identification is challenging because the measured vector of taxa counts (by amplicon sequencing) is…

Genomics · Quantitative Biology 2020-03-31 Barak Brill, Amnon Amir, Ruth Heller

High-dimensional Log-Error-in-Variable Regression with Applications to Microbial Compositional Data Analysis

In microbiome and genomic studies, the regression of compositional data has been a crucial tool for identifying microbial taxa or genes that are associated with clinical phenotypes. To account for the variation in sequencing depth, the…

Methodology · Statistics 2021-03-11 Pixu Shi, Yuchen Zhou, Anru R. Zhang

LinDA: linear models for differential abundance analysis of microbiome compositional data

Differential abundance analysis is at the core of statistical analysis of microbiome data. The compositional nature of microbiome sequencing data makes false positive control challenging. Here, we show that the compositional effects can be…

Methodology · Statistics 2022-03-15 Huijuan Zhou, Kejun He, Jun Chen, Xianyang Zhang

Elementary methods provide more replicable results in microbial differential abundance analysis

Differential abundance analysis is a key component of microbiome studies. Although dozens of methods exist, there is currently no consensus on the preferred methods. While the correctness of results in differential abundance analysis is an…

Applications · Statistics 2025-04-01 Juho Pelto, Kari Auranen, Janne Kujala, Leo Lahti

Compositional Covariate Importance Testing via Partial Conjunction of Bivariate Hypotheses

Compositional data (i.e., data comprising random variables that sum up to a constant) arises in many applications including microbiome studies, chemical ecology, political science, and experimental designs. Yet when compositional data serve…

Methodology · Statistics 2025-01-03 Ritwik Bhaduri, Siyuan Ma, Lucas Janson

Robustly detecting differential expression in RNA sequencing data using observation weights

A popular approach for comparing gene expression levels between (replicated) conditions of RNA sequencing data relies on counting reads that map to features of interest. Within such count-based methods, many flexible and advanced…

Quantitative Methods · Quantitative Biology 2014-03-17 Xiaobei Zhou, Helen Lindsay, Mark D. Robinson

Direct covariance matrix estimation with compositional data

Compositional data arise in many areas of research in the natural and biomedical sciences. One prominent example is in the study of the human gut microbiome, where one can measure the relative abundance of many distinct microorganisms in a…

Methodology · Statistics 2024-04-26 Aaron J. Molstad, Karl Oskar Ekvall, Piotr M. Suder

A Bayesian Zero-Inflated Dirichlet-Multinomial Regression Model for Multivariate Compositional Count Data

The Dirichlet-multinomial (DM) distribution plays a fundamental role in modern statistical methodology development and application. Recently, the DM distribution and its variants have been used extensively to model multivariate count data…

Methodology · Statistics 2023-02-27 Matthew D. Koslovsky

Energy Based Equality of Distributions Testing for Compositional Data

Not many tests exist for testing the equality for two or more multivariate distributions with compositional data, perhaps due to their constrained sample space. At the moment, there is only one test suggested that relies upon random…

Methodology · Statistics 2025-12-12 Volkan Sevinc, Michail Tsagris

A Bayesian Nonparametric Approach for Identifying Differentially Abundant Taxa in Multigroup Microbiome Data with Covariates

Scientific studies in the last two decades have established the central role of the microbiome in disease and health. Differential abundance analysis seeks to identify microbial taxa associated with sample groups defined by a factor such as…

Methodology · Statistics 2023-12-29 Archie Sachdeva, Somnath Datta, Subharup Guha

Robust Regression with Compositional Covariates

Many biological high-throughput data sets, such as targeted amplicon-based and metagenomic sequencing data, are compositional in nature. A common exploratory data analysis task is to infer statistical associations between the…

Methodology · Statistics 2020-07-28 Aditya Mishra, Christian L. Muller

Robust score matching for compositional data

The restricted polynomially-tilted pairwise interaction (RPPI) distribution gives a flexible model for compositional data. It is particularly well-suited to situations where some of the marginal distributions of the components of a…

Methodology · Statistics 2023-05-15 Janice L. Scealy, Kassel L. Hingee, John T. Kent, Andrew T. A. Wood

Robust Variable Selection for High-dimensional Regression with Missing Data and Measurement Errors

In our paper, we focus on robust variable selection for missing data and measurement error. Missing data and measurement errors can lead to confusing data distribution. We propose an exponential loss function with a tuning parameter to…

Methodology · Statistics 2025-07-01 Zhenhao Zhang, Yunquan Song

Robust Nonparametric Regression for Compositional Data: the Simplicial–Real case

Statistical analysis on compositional data has gained a lot of attention due to their great potential of applications. A feature of these data is that they are multivariate vectors that lie in the simplex, that is, the components of each…

Methodology · Statistics 2025-05-22 Ana M. Bianco, Graciela Boente, Wenceslao González-Manteiga, Francisco Gude Sampedro, Ana Pérez-González

Robust Causal Discovery under Imperfect Structural Constraints

Robust causal discovery from observational data under imperfect prior knowledge remains a significant and largely unresolved challenge. Existing methods typically presuppose perfect priors or can only handle specific, pre-identified error…

Machine Learning · Computer Science 2025-11-11 Zidong Wang, Xi Lin, Chuchao He, Xiaoguang Gao

On testing mean of high dimensional compositional data

We investigate one/two-sample mean tests for high-dimensional compositional data when the number of variables is comparable with the sample size, as commonly encountered in microbiome research. Existing methods mainly focus on max-type test…

Statistics Theory · Mathematics 2024-04-15 Qianqian Jiang, Wenbo Li, Zeng Li

Robust Sequential Experimental Design for A/B Testing

Experimental design has emerged as a powerful approach for improving the sample efficiency of A/B testing, yet existing designs rely critically on correctly specified models. We study robust sequential experimental design under model…

Machine Learning · Statistics 2026-05-14 Qianglin Wen, Xiangkun Wu, Chengchun Shi, Ting Li, Niansheng Tang, Yingying Zhang, Hongtu Zhu

Efficiency-improved doubly robust estimation with non-confounding predictive covariates

In observational studies, covariates with substantial missing data are often omitted, despite their strong predictive capabilities. These excluded covariates are generally believed not to simultaneously affect both treatment and outcome,…

Methodology · Statistics 2024-02-23 Shanshan Luo, Mengchen Shi, Wei Li, Xueli Wang, Zhi Geng

Towards Robust Matched Observational Studies with General Treatment Types: Consistency, Efficiency, and Adaptivity

To ensure reliable causal conclusions from observational (i.e., non-randomized) studies, researchers routinely conduct sensitivity analysis to assess robustness to hidden bias due to unmeasured confounding. In matched observational studies…

Methodology · Statistics 2025-11-11 Siyu Heng, Elaine K. Chiu, Hyunseung Kang

Robustness and Diversity Seeking Data-Free Knowledge Distillation

Knowledge distillation (KD) has enabled remarkable progress in model compression and knowledge transfer. However, KD requires a large volume of original data or their representation statistics that are not usually available in practice.…

Machine Learning · Computer Science 2021-02-11 Pengchao Han, Jihong Park, Shiqiang Wang, Yejun Liu