Related papers: Testing significance of features by lassoed princi…

Principal component gene set enrichment (PCGSE)

Motivation: Although principal component analysis (PCA) is widely used for the dimensional reduction of biomedical data, interpretation of PCA results remains daunting. Most existing methods attempt to explain each principal component (PC)…

Quantitative Methods · Quantitative Biology 2015-08-24 H. Robert Frost, Zhigang Li, Jason H. Moore

Feature-specific inference for penalized regression using local false discovery rates

Penalized regression methods, most notably the lasso, are a popular approach to analyzing high-dimensional data. An attractive property of the lasso is that it naturally performs variable selection. An important area of concern, however, is…

Methodology · Statistics 2026-05-13 Ryan Miller, Patrick Breheny

Iterative Supervised Principal Components

In high-dimensional prediction problems, where the number of features may greatly exceed the number of training instances, a fully Bayesian approach with a sparsifying prior is known to produce good results but is computationally challenging.…

Methodology · Statistics 2018-10-15 Juho Piironen, Aki Vehtari

Inference for feature selection using the Lasso with high-dimensional data

Penalized regression models such as the Lasso have proved useful for variable selection in many fields, especially in situations with high-dimensional data where the number of predictors far exceeds the number of observations. These…

Methodology · Statistics 2014-03-19 Kasper Brink-Jensen, Claus Thorn Ekstrøm

PLPCA: Persistent Laplacian Enhanced-PCA for Microarray Data Analysis

Over the years, Principal Component Analysis (PCA) has served as the baseline approach for dimensionality reduction in gene expression data analysis. Its primary objective is to identify a subset of disease-causing genes from a vast pool of…

Algebraic Topology · Mathematics 2023-06-13 Sean Cottrell, Rui Wang, Guowei Wei

Optimal Discriminant Analysis in High-Dimensional Latent Factor Models

In high-dimensional classification problems, a commonly used approach is to first project the high-dimensional features into a lower dimensional space, and base the classification on the resulting lower dimensional projections. In this…

Statistics Theory · Mathematics 2025-08-05 Xin Bing, Marten Wegkamp

Statistical significance of variables driving systematic variation

There are a number of well-established methods such as principal components analysis (PCA) for automatically capturing systematic variation due to latent variables in large-scale genomic data. PCA and related methods may directly provide a…

Methodology · Statistics 2015-03-05 Neo Christopher Chung, John D. Storey

Principal component-guided sparse regression

We propose a new method for supervised learning, especially suited to wide data where the number of features is much greater than the number of observations. The method combines the lasso ($\ell_1$) sparsity penalty with a quadratic penalty…

Methodology · Statistics 2018-10-25 J. Kenneth Tay, Jerome Friedman, Robert Tibshirani

Predictive Correlation Screening: Application to Two-stage Predictor Design in High Dimension

We introduce a new approach to variable selection, called Predictive Correlation Screening, for predictor design. Predictive Correlation Screening (PCS) implements false positive control on the selected variables, is well suited to small…

Machine Learning · Statistics 2013-04-11 Hamed Firouzi, Bala Rajaratnam, Alfred Hero

PC Adjusted Testing for Low Dimensional Parameters

In this paper, we investigate the impact of high-dimensional Principal Component (PC) adjustments on inferring the effects of variables on outcomes, with a focus on applications in genetic association studies where PC adjustment is commonly…

Statistics Theory · Mathematics 2025-06-30 Sohom Bhattacharya, Rounak Dey, Rajarshi Mukherjee

Functional Principal Component Analysis and Randomized Sparse Clustering Algorithm for Medical Image Analysis

Due to advances in sensors, increasingly large and complex medical image data can reveal pathological changes at the cellular or even molecular level, as well as anatomical changes in tissues and organs. As a consequence, the…

Machine Learning · Statistics 2016-02-17 Nan Lin, Junhai Jiang, Shicheng Guo, Momiao Xiong

Sharp detection in PCA under correlations: all eigenvalues matter

Principal component analysis (PCA) is a widely used method for dimension reduction. In high dimensional data, the "signal" eigenvalues corresponding to weak principal components (PCs) do not necessarily separate from the bulk of the "noise"…

Statistics Theory · Mathematics 2017-10-03 Edgar Dobriban

Supervised Linear Dimension-Reduction Methods: Review, Extensions, and Comparisons

Principal component analysis (PCA) is a well-known linear dimension-reduction method that has been widely used in data analysis and modeling. It is an unsupervised learning technique that identifies a suitable linear subspace for the input…

Machine Learning · Statistics 2021-09-10 Shaojie Xu, Joel Vaughan, Jie Chen, Agus Sudjianto, Vijayan Nair

Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes

Large language models (LLMs) have achieved remarkable success, yet aligning their generations with human preferences remains a critical challenge. Existing approaches to preference modeling often rely on an explicit or implicit reward…

Computation and Language · Computer Science 2025-05-09 Zhuocheng Gong, Jian Guan, Wei Wu, Huishuai Zhang, Dongyan Zhao

WinPCA: A package for windowed principal component analysis

Principal component analysis (PCA) is routinely used in population genetics to assess genetic structure. With chromosomal reference genomes and population-scale whole genome-sequencing becoming increasingly accessible, contemporary studies…

Populations and Evolution · Quantitative Biology 2025-01-22 L. Moritz Blumer, Jeffrey M. Good, Richard Durbin

Supervised Discriminative Sparse PCA for Com-Characteristic Gene Selection and Tumor Classification on Multiview Biological Data

Principal Component Analysis (PCA) has been used to study the pathogenesis of diseases. To enhance the interpretability of classical PCA, various improved PCA methods have been proposed to date. Among these, a typical method is the…

Machine Learning · Computer Science 2019-05-29 Chun-Mei Feng, Yong Xu, Jin-Xing Liu, Ying-Lian Gao, Chun-Hou Zheng

Principal Component Analysis for Experiments

Motivation: Although principal component analysis is frequently applied to reduce the dimensionality of matrix data, the method is sensitive to noise and bias and has difficulty with comparability and interpretation. These issues are…

Methodology · Statistics 2012-12-27 Tomokazu Konishi

Weighted principal component analysis: a weighted covariance eigendecomposition approach

We present a new straightforward principal component analysis (PCA) method based on the diagonalization of the weighted variance-covariance matrix through two spectral decomposition methods: power iteration and Rayleigh quotient iteration.…

Instrumentation and Methods for Astrophysics · Physics 2014-12-16 Ludovic Delchambre

Contrastive Principal Component Analysis

We present a new technique called contrastive principal component analysis (cPCA) that is designed to discover low-dimensional structure that is unique to a dataset, or enriched in one dataset relative to other data. The technique is a…

Machine Learning · Statistics 2017-11-23 Abubakar Abid, Martin J. Zhang, Vivek K. Bagaria, James Zou

Penalized Principal Component Analysis Using Smoothing

Principal components computed via PCA (principal component analysis) are traditionally used to reduce dimensionality in genomic data or to correct for population stratification. In this paper, we explore the penalized eigenvalue problem…

Applications · Statistics 2025-03-04 Rebecca M. Hurwitz, Georg Hahn