Related papers: DISCOMAX: A Proximity-Preserving Distance Correlat…

Supervised Dimensionality Reduction via Distance Correlation Maximization

In our work, we propose a novel formulation for supervised dimensionality reduction based on a nonlinear dependency criterion called Statistical Distance Correlation (Szekely et al., 2007). We propose an objective which is free of…

Machine Learning · Computer Science 2016-01-05 Praneeth Vepakomma, Chetan Tonde, Ahmed Elgammal

Feature Selection with Distance Correlation

Choosing which properties of the data to use as input to multivariate decision algorithms -- a.k.a. feature selection -- is an important step in solving any problem with machine learning. While there is a clear trend towards training…

High Energy Physics - Phenomenology · Physics 2022-12-02 Ranit Das, Gregor Kasieczka, David Shih

Feature Selection Based on Orthogonal Constraints and Polygon Area

The goal of feature selection is to choose the optimal subset of features for a recognition task by evaluating the importance of each feature, thereby achieving effective dimensionality reduction. Currently, proposed feature selection…

Machine Learning · Computer Science 2024-02-27 Zhenxing Zhang, Jun Ge, Zheng Wei, Chunjie Zhou, Yilei Wang

Linear Regression without Correspondences via Concave Minimization

Linear regression without correspondences concerns the recovery of a signal in the linear regression setting, where the correspondences between the observations and the linear functionals are unknown. The associated maximum likelihood…

Information Theory · Computer Science 2020-09-15 Liangzu Peng, Manolis C. Tsakiris

COMBSS: Best Subset Selection via Continuous Optimization

The problem of best subset selection in linear regression is considered with the aim to find a fixed size subset of features that best fits the response. This is particularly challenging when the total available number of features is very…

Methodology · Statistics 2023-11-28 Sarat Moka, Benoit Liquet, Houying Zhu, Samuel Muller

Adversarial Dependence Minimization

Many machine learning techniques rely on minimizing the covariance between output feature dimensions to extract minimally redundant representations from data. However, these methods do not eliminate all dependencies/redundancies, as…

Machine Learning · Computer Science 2025-03-12 Pierre-François De Plaen, Tinne Tuytelaars, Marc Proesmans, Luc Van Gool

Feature Augmentations for High-Dimensional Learning

High-dimensional measurements are often correlated, which motivates their approximation by factor models. This also holds true when features are engineered via low-dimensional interactions or kernel tricks. This often results in over…

Applications · Statistics 2025-09-03 Xiaonan Zhu, Bingyan Wang, Jianqing Fan

Solving the Best Subset Selection Problem via Suboptimal Algorithms

Best subset selection in linear regression is well known to be nonconvex and computationally challenging to solve, as the number of possible subsets grows rapidly with increasing dimensionality of the problem. As a result, finding the…

Machine Learning · Statistics 2025-04-01 Vikram Singh, Min Sun

Hardness and Algorithms for Robust and Sparse Optimization

We explore algorithms and limitations for sparse optimization problems such as sparse linear regression and robust linear regression. The goal of the sparse linear regression problem is to identify a small number of key features, while the…

Machine Learning · Computer Science 2022-06-30 Eric Price, Sandeep Silwal, Samson Zhou

Signal extraction approach for sparse multivariate response regression

In this paper, we consider multivariate response regression models with high dimensional predictor variables. One way to model the correlation among the response variables is through the low rank decomposition of the coefficient matrix,…

Methodology · Statistics 2015-08-06 Ruiyan Luo, Xin Qi

Principal component analysis balancing prediction and approximation accuracy for spatial data

Dimension reduction is often the first step in statistical modeling or prediction of multivariate spatial data. However, most existing dimension reduction techniques do not account for the spatial correlation between observations and do not…

Methodology · Statistics 2025-05-27 Si Cheng, Magali N. Blanco, Timothy V. Larson, Lianne Sheppard, Adam Szpiro, Ali Shojaie

Fast Distributed Approximation for Max-Cut

Finding a maximum cut is a fundamental task in many computational settings. Surprisingly, it has been insufficiently studied in the classic distributed settings, where vertices communicate by synchronously sending messages to their…

Data Structures and Algorithms · Computer Science 2017-07-27 Keren Censor-Hillel, Rina Levy, Hadas Shachnai

Max-Margin Feature Selection

Many machine learning applications such as in vision, biology and social networking deal with data in high dimensions. Feature selection is typically employed to select a subset of features which improves generalization accuracy as well…

Machine Learning · Computer Science 2016-06-15 Yamuna Prasad, Dinesh Khandelwal, K. K. Biswas

Approximation Algorithms for Optimization of Combinatorial Dynamical Systems

This paper considers an optimization problem for a dynamical system whose evolution depends on a collection of binary decision variables. We develop scalable approximation algorithms with provable suboptimality bounds to provide…

Optimization and Control · Mathematics 2016-10-31 Insoon Yang, Samuel A. Burden, Ram Rajagopal, S. Shankar Sastry, Claire J. Tomlin

Simultaneously Learning Neighborship and Projection Matrix for Supervised Dimensionality Reduction

Explicitly or implicitly, most dimensionality reduction methods need to determine which samples are neighbors and the similarity between the neighbors in the original high-dimensional space. The projection matrix is then learned on the…

Computer Vision and Pattern Recognition · Computer Science 2017-09-12 Yanwei Pang, Bo Zhou, Feiping Nie

Efficient Learning of Minimax Risk Classifiers in High Dimensions

High-dimensional data is common in multiple areas, such as health care and genomics, where the number of features can be tens of thousands. In such scenarios, the large number of features often leads to inefficient learning. Constraint…

Machine Learning · Statistics 2023-06-13 Kartheek Bondugula, Santiago Mazuelas, Aritz Pérez

Feature selection using nearest attributes

Feature selection is an important problem in high-dimensional data analysis and classification. Conventional feature selection approaches focus on detecting the features based on a redundancy criterion using learning and feature searching…

Computer Vision and Pattern Recognition · Computer Science 2012-01-31 Alex Pappachen James, Sima Dimitrijev

The Stochastic Replica Approach to Machine Learning: Stability and Parameter Optimization

We introduce a statistical physics inspired supervised machine learning algorithm for classification and regression problems. The method is based on the invariances or stability of predicted results when known data is represented as…

Machine Learning · Statistics 2018-11-19 Patrick Chao, Tahereh Mazaheri, Bo Sun, Nicholas B. Weingartner, Zohar Nussinov

Feature Selection: A perspective on inter-attribute cooperation

High-dimensional datasets pose a challenge for learning tasks in data mining and machine learning. Feature selection is an effective technique for dimensionality reduction. It is often an essential data processing step prior…

Machine Learning · Computer Science 2023-09-18 Gustavo Sosa-Cabrera, Santiago Gómez-Guerrero, Miguel García-Torres, Christian E. Schaerer

Guided Signal Reconstruction with Application to Image Magnification

We study the problem of reconstructing a signal from its projection on a subspace. The proposed signal reconstruction algorithms utilize a guiding subspace that represents desired properties of reconstructed signals. We show that optimal…

Information Theory · Computer Science 2016-06-13 Akshay Gadde, Andrew Knyazev, Dong Tian, Hassan Mansour