Related papers: Interpretable dimension reduction for compositiona…

Interpretable Discriminative Dimensionality Reduction and Feature Selection on the Manifold

Dimensionality reduction (DR) on the manifold includes effective methods which project the data from an implicit relational space onto a vectorial space. Regardless of the achievements in this area, these algorithms suffer from the lack of…

Machine Learning · Computer Science 2019-09-23 Babak Hosseini , Barbara Hammer

Reproducing Kernels and New Approaches in Compositional Data Analysis

Compositional data, such as human gut microbiomes, consist of non-negative variables for which only the values relative to other variables are available. Analyzing compositional data such as human gut microbiomes needs a careful treatment of…

Machine Learning · Statistics 2022-05-04 Binglin Li , Jeongyoun Ahn

It's All Relative: New Regression Paradigm for Microbiome Compositional Data

Microbiome data are complex in nature, involving high dimensionality, compositionality, zero inflation, and taxonomic hierarchy. Compositional data reside in a simplex that does not admit the standard Euclidean geometry. Most existing…

Methodology · Statistics 2020-11-12 Gen Li , Yan Li , Kun Chen

High-dimensional Log-Error-in-Variable Regression with Applications to Microbial Compositional Data Analysis

In microbiome and genomic studies, the regression of compositional data has been a crucial tool for identifying microbial taxa or genes that are associated with clinical phenotypes. To account for the variation in sequencing depth, the…

Methodology · Statistics 2021-03-11 Pixu Shi , Yuchen Zhou , Anru R. Zhang

Solving Interpretable Kernel Dimension Reduction

Kernel dimensionality reduction (KDR) algorithms find a low dimensional representation of the original data by optimizing kernel dependency measures that are capable of capturing nonlinear relationships. The standard strategy is to first…

Machine Learning · Statistics 2019-09-26 Chieh Wu , Jared Miller , Yale Chang , Mario Sznaier , Jennifer Dy

Local Explanation of Dimensionality Reduction

Dimensionality reduction (DR) is a popular method for preparing and analyzing high-dimensional data. Reduced data representations are less computationally intensive and easier to manage and visualize, while retaining a significant…

Machine Learning · Computer Science 2022-05-02 Avraam Bardos , Ioannis Mollas , Nick Bassiliades , Grigorios Tsoumakas

An adaptive composite quantile approach to dimension reduction

Sufficient dimension reduction [J. Amer. Statist. Assoc. 86 (1991) 316-342] has long been a prominent issue in multivariate nonparametric regression analysis. To uncover the central dimension reduction space, we propose in this paper an…

Statistics Theory · Mathematics 2014-08-15 Efang Kong , Yingcun Xia

An Interpretable Compression and Classification System: Theory and Applications

This study proposes a low-complexity interpretable classification system. The proposed system contains three main modules including feature extraction, feature reduction, and classification. All of them are linear. Thanks to the linear…

Computer Vision and Pattern Recognition · Computer Science 2020-04-15 Tzu-Wei Tseng , Kai-Jiun Yang , C.-C. Jay Kuo , Shang-Ho Tsai

Supervised Learning and Model Analysis with Compositional Data

The compositionality and sparsity of high-throughput sequencing data pose a challenge for regression and classification. However, in microbiome research in particular, conditional modeling is an essential tool to investigate relationships…

Machine Learning · Statistics 2023-07-19 Shimeng Huang , Elisabeth Ailer , Niki Kilbertus , Niklas Pfister

Principal component analysis for high-dimensional compositional data

Dimension reduction for high-dimensional compositional data plays an important role in many fields, where the principal component analysis of the basis covariance matrix is of scientific interest. In practice, however, the basis variables…

Methodology · Statistics 2021-09-13 Jingru Zhang , Wei Lin

Large Covariance Estimation for Compositional Data via Composition-Adjusted Thresholding

High-dimensional compositional data arise naturally in many applications such as metagenomic data analysis. The observed data lie in a high-dimensional simplex, and conventional statistical methods often fail to produce sensible results due…

Methodology · Statistics 2016-01-19 Yuanpei Cao , Wei Lin , Hongzhe Li

CARE: Large Precision Matrix Estimation for Compositional Data

High-dimensional compositional data are prevalent in many applications. The simplex constraint poses intrinsic challenges to inferring the conditional dependence relationships among the components forming a composition, as encoded by a…

Methodology · Statistics 2024-03-25 Shucong Zhang , Huiyuan Wang , Wei Lin

Data Augmentation for Compositional Data: Advancing Predictive Models of the Microbiome

Data augmentation plays a key role in modern machine learning pipelines. While numerous augmentation strategies have been studied in the context of computer vision and natural language processing, less is known for other data modalities.…

Machine Learning · Statistics 2022-05-23 Elliott Gordon-Rodriguez , Thomas P. Quinn , John P. Cunningham

An Infinite Dimensional Analysis of Kernel Principal Components

We study non-linear data-dimension reduction. We are motivated by the classical linear framework of Principal Component Analysis. In the nonlinear case, we introduce instead a new kernel-Principal Component Analysis, manifold and feature space…

Functional Analysis · Mathematics 2022-09-09 Palle E. T. Jorgensen , Sooran Kang , Myung-Sin Song , Feng Tian

Debiased high-dimensional regression calibration for errors-in-variables log-contrast models

Motivated by the challenges in analyzing gut microbiome and metagenomic data, this work aims to tackle the issue of measurement errors in high-dimensional regression models that involve compositional covariates. This paper marks a…

Methodology · Statistics 2024-09-13 Huali Zhao , Tianying Wang

A critical comparison of handling zeros in high-dimensional compositional count data

The growing use of high-throughput sequencing (HTS) has enabled the large-scale production of compositional count data, driving progress in microbiome research. However, such count data are often high-dimensional, over-dispersed, and…

Other Statistics · Statistics 2026-05-22 Wenqi Tang , Kamila Fačevicová , Klaus Nordhausen , Sara Taskinen

Principal Subsimplex Analysis

Compositional data, also referred to as simplicial data, naturally arise in many scientific domains such as geochemistry, microbiology, and economics. In such domains, obtaining sensible lower-dimensional representations and modes of…

Methodology · Statistics 2025-04-15 Hyeon Lee , Kassel Liam Hingee , Janice L. Scealy , Andrew T. A. Wood , Eric Grunsky , J. S. Marron

Principal component analysis balancing prediction and approximation accuracy for spatial data

Dimension reduction is often the first step in statistical modeling or prediction of multivariate spatial data. However, most existing dimension reduction techniques do not account for the spatial correlation between observations and do not…

Methodology · Statistics 2025-05-27 Si Cheng , Magali N. Blanco , Timothy V. Larson , Lianne Sheppard , Adam Szpiro , Ali Shojaie

Incomplete Pivoted QR-based Dimensionality Reduction

High-dimensional big data appears in many research fields such as image recognition, biology and collaborative filtering. Often, the exploration of such data by classic algorithms encounters difficulties due to `curse of…

Machine Learning · Computer Science 2016-07-13 Amit Bermanis , Aviv Rotbart , Moshe Salhov , Amir Averbuch

Making Interpretable Discoveries from Unstructured Data: A High-Dimensional Multiple Hypothesis Testing Approach

Social scientists are increasingly turning to unstructured datasets to unlock new empirical insights, e.g., estimating descriptive statistics of or causal effects on quantitative measures derived from text, audio, or video data. In many…

Econometrics · Economics 2026-05-06 Jacob Carlson