Related papers: Best Subset Solution Path for Linear Dimension Red…

Supervised Linear Dimension-Reduction Methods: Review, Extensions, and Comparisons

Principal component analysis (PCA) is a well-known linear dimension-reduction method that has been widely used in data analysis and modeling. It is an unsupervised learning technique that identifies a suitable linear subspace for the input…

Machine Learning · Statistics 2021-09-10 Shaojie Xu, Joel Vaughan, Jie Chen, Agus Sudjianto, Vijayan Nair

Parameter Selection Algorithm For Continuous Variables

In this article, we propose a new algorithm for supervised learning methods, by which one can both capture the non-linearity in data and also find the best subset model. To produce an enhanced subset of the original variables, an ideal…

Applications · Statistics 2017-01-23 Peyman Tavallali, Marianne Razavi, Sean Brady

COMBSS: Best Subset Selection via Continuous Optimization

The problem of best subset selection in linear regression is considered with the aim to find a fixed size subset of features that best fits the response. This is particularly challenging when the total available number of features is very…

Methodology · Statistics 2023-11-28 Sarat Moka, Benoit Liquet, Houying Zhu, Samuel Muller

Solving the Best Subset Selection Problem via Suboptimal Algorithms

Best subset selection in linear regression is well known to be nonconvex and computationally challenging to solve, as the number of possible subsets grows rapidly with increasing dimensionality of the problem. As a result, finding the…

Machine Learning · Statistics 2025-04-01 Vikram Singh, Min Sun

Principal component analysis balancing prediction and approximation accuracy for spatial data

Dimension reduction is often the first step in statistical modeling or prediction of multivariate spatial data. However, most existing dimension reduction techniques do not account for the spatial correlation between observations and do not…

Methodology · Statistics 2025-05-27 Si Cheng, Magali N. Blanco, Timothy V. Larson, Lianne Sheppard, Adam Szpiro, Ali Shojaie

Principal Component Analysis When n < p: Challenges and Solutions

Principal Component Analysis is a key technique for reducing the complexity of high-dimensional data while preserving its fundamental data structure, ensuring models remain stable and interpretable. This is achieved by transforming the…

Methodology · Statistics 2025-03-25 Nuwan Weeraratne, Lyn Hunt, Jason Kurz

A Statistical View of Column Subset Selection

We consider the problem of selecting a small subset of representative variables from a large dataset. In the computer science literature, this dimensionality reduction problem is typically formalized as Column Subset Selection (CSS).…

Methodology · Statistics 2025-05-20 Anav Sood, Trevor Hastie

Prescriptive PCA: Dimensionality Reduction for Two-stage Stochastic Optimization

In this paper, we consider the alignment between an upstream dimensionality reduction task of learning a low-dimensional representation of a set of high-dimensional data and a downstream optimization task of solving a stochastic program…

Machine Learning · Computer Science 2024-03-13 Long He, Ho-Yin Mak

Dynamic Principal Component Analysis in High Dimensions

Principal component analysis is a versatile tool to reduce dimensionality which has wide applications in statistics and machine learning. It is particularly useful for modeling data in high-dimensional scenarios where the number of…

Methodology · Statistics 2022-08-18 Xiaoyu Hu, Fang Yao

Linear Dimensionality Reduction: Survey, Insights, and Generalizations

Linear dimensionality reduction methods are a cornerstone of analyzing high dimensional data, due to their simple geometric interpretations and typically attractive computational properties. These methods capture many data features of…

Machine Learning · Statistics 2016-03-22 John P. Cunningham, Zoubin Ghahramani

The Sparse Principal Component Analysis Problem: Optimality Conditions and Algorithms

Sparse principal component analysis addresses the problem of finding a linear combination of the variables in a given data set with a sparse coefficients vector that maximizes the variability of the data. This model enhances the ability to…

Optimization and Control · Mathematics 2017-03-09 Amir Beck, Yakov Vaisbourd

Exploring dimension learning via a penalized probabilistic principal component analysis

Establishing a low-dimensional representation of the data leads to efficient data learning strategies. In many cases, the reduced dimension needs to be explicitly stated and estimated from the data. We explore the estimation of dimension in…

Methodology · Statistics 2022-02-10 Wei Q. Deng, Radu V. Craiu

Adjusting systematic bias in high dimensional principal component scores

Principal component analysis continues to be a powerful tool in dimension reduction of high dimensional data. We assume a variance-diverging model and use the high-dimension, low-sample-size asymptotics to show that even though the…

Statistics Theory · Mathematics 2020-09-28 Sungkyu Jung

DPCA: Dimensionality Reduction for Discriminative Analytics of Multiple Large-Scale Datasets

Principal component analysis (PCA) has well-documented merits for data extraction and dimensionality reduction. PCA deals with a single dataset at a time, and it is challenged when it comes to analyzing multiple datasets. Yet in certain…

Machine Learning · Computer Science 2017-10-27 Gang Wang, Jia Chen, Georgios B. Giannakis

Partitioned Least Squares

In this paper we propose a variant of the linear least squares model allowing practitioners to partition the input features into groups of variables that they require to contribute similarly to the final result. The output allows…

Machine Learning · Computer Science 2024-07-17 Roberto Esposito, Mattia Cerrato, Marco Locatelli

Linear Tensor Projection Revealing Nonlinearity

Dimensionality reduction is an effective method for learning high-dimensional data, which can provide better understanding of decision boundaries in human-readable low-dimensional subspace. Linear methods, such as principal component…

Machine Learning · Computer Science 2020-07-09 Koji Maruhashi, Heewon Park, Rui Yamaguchi, Satoru Miyano

Dual Principal Component Pursuit: Probability Analysis and Efficient Algorithms

Recent methods for learning a linear subspace from data corrupted by outliers are based on convex $\ell_1$ and nuclear norm optimization and require the dimension of the subspace and the number of outliers to be sufficiently small. In sharp…

Machine Learning · Computer Science 2018-12-27 Zhihui Zhu, Yifan Wang, Daniel P. Robinson, Daniel Q. Naiman, Rene Vidal, Manolis C. Tsakiris

Maximum Margin Principal Components

Principal Component Analysis (PCA) is a very successful dimensionality reduction technique, widely used in predictive modeling. A key factor in its widespread use in this domain is the fact that the projection of a dataset onto its first…

Machine Learning · Statistics 2017-05-19 Xianghui Luo, Robert J. Durrant

Dynamic Supervised Principal Component Analysis for Classification

This paper introduces a novel framework for dynamic classification in high dimensional spaces, addressing the evolving nature of class distributions over time or other index variables. Traditional discriminant analysis techniques are…

Methodology · Statistics 2025-02-18 Wenbo Ouyang, Ruiyang Wu, Ning Hao, Hao Helen Zhang

Principal Component Analysis for Experiments

Motivation: Although principal component analysis is frequently applied to reduce the dimensionality of matrix data, the method is sensitive to noise and bias and has difficulty with comparability and interpretation. These issues are…

Methodology · Statistics 2012-12-27 Tomokazu Konishi