Related papers: Sparse Variable Selection on High Dimensional Hete…

200 papers

In high dimensions, many variable selection methods, such as the lasso, are often limited by excessive variability and rank deficiency of the sample covariance matrix. Covariance sparsity is a natural phenomenon in high-dimensional…

Methodology · Statistics 2010-06-08 X. Jessie Jeng , Z. John Daye

In many modern applications, including analysis of gene expression and text documents, the data are noisy, high-dimensional, and unordered--with no particular meaning to the given order of the variables. Yet, successful learning is often…

Methodology · Statistics 2008-07-25 Ann B. Lee , Boaz Nadler , Larry Wasserman

Objective: Social-environmental data obtained from the U.S. Census is an important resource for understanding health disparities, but rarely is the full dataset utilized for analysis. A barrier to incorporating the full data is a lack of…

Applications · Statistics 2020-09-02 Elizabeth Handorf , Yinuo Yin , Michael Slifker , Shannon Lynch

In this paper, we introduce the Adaptive Cluster Lasso (ACL) method for variable selection in high dimensional sparse regression models with strongly correlated variables. To handle correlated variables, the concept of clustering or grouping…

Machine Learning · Statistics 2016-03-14 Niharika Gauraha , Swapan K. Parui

Variable selection for high-dimensional, highly correlated data has long been a challenging problem, often yielding unstable and unreliable models. We propose a resample-aggregate framework that exploits diffusion models' ability to…

Methodology · Statistics 2025-08-20 Minjie Wang , Xiaotong Shen , Wei Pan

This paper studies model selection consistency for high dimensional sparse regression when data exhibits both cross-sectional and serial dependency. Most commonly-used model selection methods fail to consistently recover the true model when…

Methodology · Statistics 2018-09-12 Jianqing Fan , Yuan Ke , Kaizheng Wang

Sparse modelling or model selection with categorical data is challenging even for a moderate number of variables, because one parameter is roughly needed to encode one category or level. The Group Lasso is a well known efficient algorithm…

Methodology · Statistics 2022-11-14 Szymon Nowakowski , Piotr Pokarowski , Wojciech Rejchel , Agnieszka Sołtys

Analysis of high-dimensional data is currently a popular field of research, thanks to many applications e.g. in genetics (DNA data in genomewide association studies), spectrometry or web analysis. At the same time, the type of problems that…

Methodology · Statistics 2018-05-25 Jozef Jakubik

We consider the high-dimensional discriminant analysis problem. For this problem, different methods have been proposed and justified by establishing exact convergence rates for the classification risk, as well as the l2 convergence results…

Machine Learning · Statistics 2013-06-28 Mladen Kolar , Han Liu

This paper is concerned with high-dimensional panel data models where the number of regressors can be much larger than the sample size. Under the assumption that the true parameter vector is sparse we propose a panel-Lasso estimator and…

Statistics Theory · Mathematics 2014-02-14 Anders Bredahl Kock

Motivation: The high dimensionality of genomic data calls for the development of specific classification methodologies, especially to prevent over-optimistic predictions. This challenge can be tackled by compression and variable selection,…

Methodology · Statistics 2021-04-10 G. Durif , L. Modolo , J. Michaelsson , J. E. Mold , S. Lambert-Lacroix , F. Picard

Standard high-dimensional regression methods assume that the underlying coefficient vector is sparse. This might not be true in some cases, in particular in presence of hidden, confounding variables. Such hidden confounding can be…

Methodology · Statistics 2020-08-19 Domagoj Ćevid , Peter Bühlmann , Nicolai Meinshausen

Many high-dimensional data sets suffer from hidden confounding which affects both the predictors and the response of interest. In such situations, standard regression methods or algorithms lead to biased estimates. This paper substantially…

Methodology · Statistics 2024-12-17 Cyrill Scheidegger , Zijian Guo , Peter Bühlmann

We examine the linear regression problem in a challenging high-dimensional setting with correlated predictors where the vector of coefficients can vary from sparse to dense. In this setting, we propose a combination of probabilistic…

Methodology · Statistics 2025-05-13 Roman Parzer , Peter Filzmoser , Laura Vana-Gür

This paper presents an innovative approach to dimensionality reduction and feature extraction in high-dimensional datasets, with a specific application focus on wood surface defect detection. The proposed framework integrates sparse…

Machine Learning · Computer Science 2024-10-01 Harish Neelam , Koushik Sai Veerella , Souradip Biswas

We propose a novel structure selection method for high dimensional (d > 100) sparse vine copulas. Current sequential greedy approaches for structure selection require calculating spanning trees in hundreds of dimensions and fitting the pair…

Methodology · Statistics 2017-05-18 Dominik Müller , Claudia Czado

Decision trees are widely-used classification and regression models because of their interpretability and good accuracy. Classical methods such as CART are based on greedy approaches but a growing attention has recently been devoted to…

Machine Learning · Computer Science 2021-12-16 Edoardo Amaldi , Antonio Consolo , Andrea Manno

This paper investigates the high-dimensional linear regression with highly correlated covariates. In this setup, the traditional sparsity assumption on the regression coefficients often fails to hold, and consequently many model selection…

Methodology · Statistics 2019-03-26 Jianqing Fan , Bai Jiang , Qiang Sun

In genomic studies, identifying biomarkers associated with a variable of interest is a major concern in biomedical research. Regularized approaches are classically used to perform variable selection in high-dimensional linear models.…

Methodology · Statistics 2020-07-22 Wencan Zhu , Céline Lévy-Leduc , Nils Ternès

Sparse linear regression is a central problem in high-dimensional statistics. We study the correlated random design setting, where the covariates are drawn from a multivariate Gaussian $N(0,\Sigma)$, and we seek an estimator with small…

Data Structures and Algorithms · Computer Science 2023-05-29 Jonathan Kelner , Frederic Koehler , Raghu Meka , Dhruv Rohatgi