Related papers: Direct covariance matrix estimation with compositi…

Large Covariance Estimation for Compositional Data via Composition-Adjusted Thresholding

High-dimensional compositional data arise naturally in many applications such as metagenomic data analysis. The observed data lie in a high-dimensional simplex, and conventional statistical methods often fail to produce sensible results due…

Methodology · Statistics 2016-01-19 Yuanpei Cao, Wei Lin, Hongzhe Li

Covariance Matrix Estimation for High-Throughput Biomedical Data with Interconnected Communities

Estimating a covariance matrix is central to high-dimensional data analysis. Empirical analyses of high-dimensional biomedical data, including genomics, proteomics, microbiome, and neuroimaging, among others, consistently reveal strong…

Methodology · Statistics 2024-12-05 Yifan Yang, Chixiang Chen, Shuo Chen

Robust Covariance Estimation for High-dimensional Compositional Data with Application to Microbial Communities Analysis

Microbial communities analysis is drawing growing attention due to the rapid development of high-throughput sequencing techniques nowadays. The observed data has the following typical characteristics: it is high-dimensional, compositional…

Methodology · Statistics 2020-04-30 Yong He, Pengfei Liu, Xinsheng Zhang, Wang Zhou

Regression models for compositional data: General log-contrast formulations, proximal optimization, and microbiome data applications

Compositional data sets are ubiquitous in science, including geology, ecology, and microbiology. In microbiome research, compositional data primarily arise from high-throughput sequence-based profiling experiments. These data comprise…

Statistics Theory · Mathematics 2019-03-05 Patrick L. Combettes, Christian L. Müller

Regression Analysis for Microbiome Compositional Data

One important problem in microbiome analysis is to identify the bacterial taxa that are associated with a response, where the microbiome data are summarized as the composition of the bacterial taxa at different taxonomic levels. This paper…

Applications · Statistics 2016-03-04 Pixu Shi, Anru Zhang, Hongzhe Li

Random Matrix Improved Covariance Estimation for a Large Class of Metrics

Relying on recent advances in statistical estimation of covariance distances based on random matrix theory, this article proposes an improved covariance and precision matrix estimation for a wide family of metrics. The method is shown to…

Machine Learning · Statistics 2021-02-03 Malik Tiomoko, Florent Bouchard, Guillaume Ginolhac, Romain Couillet

Sparse Positive-Definite Estimation for Covariance Matrices with Repeated Measurements

Repeated measurements are common in many fields, where random variables are observed repeatedly across different subjects. Such data have an underlying hierarchical structure, and it is of interest to learn covariance/correlation at…

Methodology · Statistics 2023-06-13 Sunpeng Duan, Guo Yu, Juntao Duan, Yuedong Wang

Joint Covariance Estimation with Mutual Linear Structure

We consider the problem of joint estimation of structured covariance matrices. Assuming the structure is unknown, estimation is achieved using heterogeneous training sets. Namely, given groups of measurements coming from centered…

Statistics Theory · Mathematics 2016-04-20 Ilya Soloveychik, Ami Wiesel

Robust high-dimensional precision matrix estimation

The dependency structure of multivariate data can be analyzed using the covariance matrix $\Sigma$. In many fields the precision matrix $\Sigma^{-1}$ is even more informative. As the sample covariance estimator is singular in…

Methodology · Statistics 2015-06-04 Viktoria Öllerer, Christophe Croux

Latent Network Estimation and Variable Selection for Compositional Data via Variational EM

Network estimation and variable selection have been extensively studied in the statistical literature, but only recently have those two challenges been addressed simultaneously. In this paper, we seek to develop a novel method to…

Methodology · Statistics 2024-06-11 Nathan Osborne, Christine B. Peterson, Marina Vannucci

Optimal covariance matrix estimation for high-dimensional noise in high-frequency data

We consider high-dimensional measurement errors with high-frequency data. Our objective is on recovering the high-dimensional cross-sectional covariance matrix of the random errors with optimality. In this problem, not all components of the…

Statistics Theory · Mathematics 2024-04-03 Jinyuan Chang, Qiao Hu, Cheng Liu, Cheng Yong Tang

A Structured Estimator for large Covariance Matrices in the Presence of Pairwise and Spatial Covariates

We consider the problem of estimating a high-dimensional covariance matrix from a small number of observations when covariates on pairs of variables are available and the variables can have spatial structure. This is motivated by the…

Methodology · Statistics 2024-11-08 Martin Metodiev, Marie Perrot-Dockès, Sarah Ouadah, Bailey K. Fosdick, Stéphane Robin, Pierre Latouche, Adrian E. Raftery

A Compound Decision Approach to Covariance Matrix Estimation

Covariance matrix estimation is a fundamental statistical task in many applications, but the sample covariance matrix is sub-optimal when the sample size is comparable to or less than the number of features. Such high-dimensional settings…

Methodology · Statistics 2022-06-06 Huiqin Xin, Sihai Dave Zhao

Instrumental Variable Estimation for Compositional Treatments

Many scientific datasets are compositional in nature. Important biological examples include species abundances in ecology, cell-type compositions derived from single-cell sequencing data, and amplicon abundance data in microbiome research.…

Machine Learning · Computer Science 2024-05-29 Elisabeth Ailer, Christian L. Müller, Niki Kilbertus

Capturing Between-Tasks Covariance and Similarities Using Multivariate Linear Mixed Models

We consider the problem of predicting several response variables using the same set of explanatory variables. This setting naturally induces a group structure over the coefficient matrix, in which every explanatory variable corresponds to a…

Methodology · Statistics 2019-10-03 Aviv Navon, Saharon Rosset

Covariance estimation for vertically partitioned data in a distributed environment

The major sources of abundant data are constantly expanding with the available data collection methodologies in various applications - medical, insurance, scientific, bio-informatics and business. These data sets may be distributed…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-06-24 Aruna Govada, Sanjay K. Sahay

Generalized Linear Models with Linear Constraints for Microbiome Compositional Data

Motivated by regression analysis for microbiome compositional data, this paper considers generalized linear regression analysis with compositional covariates, where a group of linear constraints on regression coefficients are imposed to…

Methodology · Statistics 2018-01-11 Jiarui Lu, Pixu Shi, Hongzhe Li

Hypothesis-driven mediation analysis for compositional data: an application to gut microbiome

Biological sequencing data consist of read counts, e.g. of specified taxa and often exhibit sparsity (zero-count inflation) and overdispersion (extra-Poisson variability). As most sequencing techniques provide an arbitrary total count,…

Applications · Statistics 2024-07-01 Noora Kartiosuo, Jaakko Nevalainen, Olli Raitakari, Katja Pahkala, Kari Auranen

Cross-study analyses of microbial abundance using generalized common factor methods

By creating networks of biochemical pathways, communities of micro-organisms are able to modulate the properties of their environment and even the metabolic processes within their hosts. Next-generation high-throughput sequencing has led to…

Applications · Statistics 2023-03-28 Molly G. Hayes, Morgan G. I. Langille, Hong Gu

Robust and Well-conditioned Sparse Estimation for High-dimensional Covariance Matrices

Estimating covariance matrices with high-dimensional complex data presents significant challenges, particularly concerning positive definiteness, sparsity, and numerical stability. Existing robust sparse estimators often fail to guarantee…

Methodology · Statistics 2025-12-30 Shaoxin Wang, Ziyun Ma