
Related papers: Supervised Learning and Model Analysis with Compos…

200 papers

Microbiome data are complex in nature, involving high dimensionality, compositionality, zero inflation, and taxonomic hierarchy. Compositional data reside in a simplex that does not admit the standard Euclidean geometry. Most existing…

Methodology · Statistics 2020-11-12 Gen Li, Yan Li, Kun Chen

The analysis of human microbiome data is often based on dimension-reduced graphical displays and clustering derived from vectors of microbial abundances in each sample. Common to these ordination methods is the use of biologically motivated…

Applications · Statistics 2017-01-11 Timothy W. Randolph, Sen Zhao, Wade Copeland, Meredith Hullar, Ali Shojaie

One important problem in microbiome analysis is to identify the bacterial taxa that are associated with a response, where the microbiome data are summarized as the composition of the bacterial taxa at different taxonomic levels. This paper…

Applications · Statistics 2016-03-04 Pixu Shi, Anru Zhang, Hongzhe Li

Compositional data, such as human gut microbiomes, consist of non-negative variables for which only the values relative to other variables are available. Analyzing compositional data such as human gut microbiomes needs a careful treatment of…

Machine Learning · Statistics 2022-05-04 Binglin Li, Jeongyoun Ahn

Machine learning models can represent climate processes that are nonlocal in horizontal space, height, and time, often by combining information across these dimensions in highly nonlinear ways. While this can improve predictive skill, it…

Machine Learning · Computer Science 2026-05-14 Savannah L. Ferretti, Jerry Lin, Sara Shamekh, Jane W. Baldwin, Michael S. Pritchard, Tom Beucler

Despite its importance, choosing the structural form of the kernel in nonparametric regression remains a black art. We define a space of kernel structures which are built compositionally by adding and multiplying a small number of base…

Machine Learning · Statistics 2013-05-15 David Duvenaud, James Robert Lloyd, Roger Grosse, Joshua B. Tenenbaum, Zoubin Ghahramani

Signal processing tasks as fundamental as sampling, reconstruction, minimum mean-square error interpolation and prediction can be viewed under the prism of reproducing kernel Hilbert spaces. Endowing this vantage point with contemporary…

Machine Learning · Computer Science 2013-02-25 Juan Andres Bazerque, Georgios B. Giannakis

High-dimensional compositional data, such as those from human microbiome studies, pose unique statistical challenges due to the simplex constraint and excess zeros. While dimension reduction is indispensable for analyzing such data,…

Methodology · Statistics 2025-09-09 Junyoung Park, Cheolwoo Park, Jeongyoun Ahn

Research in modern data-driven dynamical systems is typically focused on the three key challenges of high dimensionality, unknown dynamics, and nonlinearity. The dynamic mode decomposition (DMD) has emerged as a cornerstone for modeling…

Fluid Dynamics · Physics 2022-04-27 Peter J. Baddoo, Benjamin Herrmann, Beverley J. McKeon, Steven L. Brunton

By removing irrelevant and redundant features, feature selection aims to find a good representation of the original features. With the prevalence of unlabeled data, unsupervised feature selection has been proven effective in alleviating the…

Machine Learning · Computer Science 2024-03-25 Ziyuan Lin, Deanna Needell

Artificial neural networks show promising performance in detecting correlations within data that are associated with specific outcomes. However, the black-box nature of such models can hinder knowledge advancement in research fields by…

Machine Learning · Computer Science 2023-10-09 Jonas C. Ditz, Bernhard Reuter, Nico Pfeifer

This paper considers the problem of kernel regression and classification with possibly unobservable response variables in the data, where the mechanism that causes the absence of information is unknown and can depend on both predictors and…

Statistics Theory · Mathematics 2022-12-07 Majid Mojirsheibani, William Pouliot, Andre Shakhbandaryan

Structural equation models (SEMs) have been widely adopted for inference of causal interactions in complex networks. Recent examples include unveiling topologies of hidden causal networks over which processes such as spreading diseases, or…

Machine Learning · Statistics 2017-04-05 Yanning Shen, Brian Baingana, Georgios B. Giannakis

We propose and analyze a novel framework for learning sparse representations, based on two statistical techniques: kernel smoothing and marginal regression. The proposed approach provides a flexible framework for incorporating feature…

Machine Learning · Statistics 2012-10-04 Krishnakumar Balasubramanian, Kai Yu, Guy Lebanon

Kernel regression is a popular non-parametric fitting technique. It aims at learning a function which estimates the targets for test inputs as precisely as possible. Generally, the function value for a test input is estimated by a weighted…

Machine Learning · Computer Science 2017-12-27 Rongqing Huang, Shiliang Sun

Data imputation, the process of filling in missing feature elements for incomplete data sets, plays a crucial role in data-driven learning. A fundamental belief is that data imputation is helpful for learning performance, and it follows…

Machine Learning · Computer Science 2025-09-30 Ruikai Yang, Fan He, Mingzhen He, Kaijie Wang, Xiaolin Huang

Depth measures have gained popularity in the statistical literature for defining level sets in complex data structures like multivariate data, functional data, and graphs. Despite their versatility, integrating depth measures into…

In complex visual recognition tasks it is typical to adopt multiple descriptors that describe different aspects of the images to obtain improved recognition performance. Descriptors that have diverse forms can be fused into a…

Computer Vision and Pattern Recognition · Computer Science 2015-06-15 Jayaraman J. Thiagarajan, Karthikeyan Natesan Ramamurthy, Andreas Spanias

Modern Bayesian optimization and adaptive sampling methods increasingly rely on nonlinear parametric models, yet theoretical guarantees for such models under adaptive data collection remain limited. Existing analyses largely focus on…

Machine Learning · Statistics 2026-05-14 Rafael Oliveira

High-throughput pheno-, geno-, and envirotyping allows characterization of plant genotypes and the trials they are evaluated in, producing different types of data. These different data modalities can be integrated into statistical or…
