English
Related papers

Related papers: Robust distance correlation for variable screening

200 papers

Independence screening is a variable selection method that uses a ranking criterion to select significant variables, particularly for statistical models with nonpolynomial dimensionality or "large p, small n" paradigms when p can be as…

Methodology · Statistics 2012-10-18 Gaorong Li , Heng Peng , Jun Zhang , Lixing Zhu

Variable screening is a fast dimension reduction technique for assisting high dimensional feature selection. As a preselection method, it selects a moderate size subset of candidate variables for further refining via feature selection to…

Statistics Theory · Mathematics 2015-06-09 Xiangyu Wang , Chenlei Leng , David B. Dunson

As a computationally fast and working efficient tool, sure independence screening has received much attention in solving ultrahigh dimensional problems. This paper contributes two robust sure screening approaches that simultaneously take…

Methodology · Statistics 2021-07-27 Xiaochao Xia

Feature screening approaches are effective in selecting active features from data with ultrahigh dimensionality and increasing complexity; however, the majority of existing feature screening approaches are either restricted to a univariate…

Methodology · Statistics 2023-05-09 Shaofei Zhao , Guifang Fu

Variable selection in ultrahigh-dimensional linear regression is challenging due to its high computational cost. Therefore, a screening step is usually conducted before variable selection to significantly reduce the dimension. Here we…

Methodology · Statistics 2025-04-29 Run Wang , An Nguyen , Somak Dutta , Vivekananda Roy

Advancement in technology has generated abundant high-dimensional data that allows integration of multiple relevant studies. Due to their huge computational advantage, variable screening methods based on marginal correlation have become…

Methodology · Statistics 2017-10-12 Tianzhou Ma , Zhao Ren , George C. Tseng

This paper is concerned with screening features in ultrahigh dimensional data analysis, which has become increasingly important in diverse scientific fields. We develop a sure independence screening procedure based on the distance…

Methodology · Statistics 2012-06-04 Runze Li , Wei Zhong , Liping Zhu

Variable selection is of increasing importance to address the difficulties of high dimensionality in many scientific areas. In this paper, we demonstrate a property for distance covariance, which is incorporated in a novel feature screening…

Methodology · Statistics 2014-09-03 Jing Kong , Sijian Wang , Grace Wahba

A ubiquitous feature of data of our era is their extra-large sizes and dimensions. Analyzing such high-dimensional data poses significant challenges, since the feature dimension is often much larger than the sample size. This thesis…

Statistics Theory · Mathematics 2025-09-11 Kai Yang

Variable selection plays an important role in high dimensional statistical modeling which nowadays appears in many areas and is key to various scientific discoveries. For problems of large scale or dimensionality $p$, estimation accuracy…

Statistics Theory · Mathematics 2008-08-27 Jianqing Fan , Jinchi Lv

This paper proposes a model-free and data-adaptive feature screening method for ultra-high dimensional datasets. The proposed method is based on the projection correlation which measures the dependence between two random vectors. This…

Methodology · Statistics 2021-02-16 Wanjun Liu , Yuan Ke , Jingyuan Liu , Runze Li

Variable selection in ultra-high dimensional regression problems has become an important issue. In such situations, penalized regression models may face computational problems and some pre screening of the variables may be necessary. A…

Methodology · Statistics 2020-05-01 Abhik Ghosh , Magne Thoresen

Gini distance correlation (GDC) was recently proposed to measure the dependence between a categorical variable, Y, and a numerical random vector, X. It mutually characterizes independence between X and Y. In this article, we utilize the GDC…

Methodology · Statistics 2023-04-19 Yongli Sang , Xin Dang

We study scalable alternatives to robust gradient descent (RGD) techniques that can be used when the losses and/or gradients can be heavy-tailed, though this will be unknown to the learner. The core technique is simple: instead of trying to…

Machine Learning · Statistics 2020-12-15 Matthew J. Holland

This paper treats the problem of screening for variables with high correlations in high dimensional data in which there can be many fewer samples than variables. We focus on threshold-based correlation screening methods for three related…

Machine Learning · Statistics 2015-03-18 Alfred O. Hero , Bala Rajaratnam

The applications of traditional statistical feature selection methods to high-dimension, low sample-size data often struggle and encounter challenging problems, such as overfitting, curse of dimensionality, computational infeasibility, and…

Machine Learning · Statistics 2023-12-19 Kexuan Li , Fangfang Wang , Lingli Yang , Ruiqi Liu

We study a scalable alternative to robust gradient descent (RGD) techniques that can be used when the gradients can be heavy-tailed, though this will be unknown to the learner. The core technique is simple: instead of trying to robustly…

Machine Learning · Statistics 2020-12-16 Matthew J. Holland

We suggest a robust nearest-neighbor approach to classifying high-dimensional data. The method enhances sensitivity by employing a threshold and truncates to a sequence of zeros and ones in order to reduce the deleterious impact of…

Statistics Theory · Mathematics 2009-09-02 Yao-ban Chan , Peter Hall

Long-tailed classification is challenging due to its heavy imbalance in class probabilities. While existing methods often focus on overall accuracy or accuracy for tail classes, they overlook a critical aspect: certain types of errors can…

Machine Learning · Computer Science 2025-01-27 Bolian Li , Ruqi Zhang

Sure Independence Screening is a fast procedure for variable selection in ultra-high dimensional regression analysis. Unfortunately, its performance greatly deteriorates with increasing dependence among the predictors. To solve this issue,…

Methodology · Statistics 2018-11-15 Yixin Wang , Stefan Van Aelst
‹ Prev 1 2 3 10 Next ›