Related papers: Robust Distributional Regression with Automatic Va…

Robust Estimation and Shrinkage in Ultrahigh Dimensional Expectile Regression with Heavy Tails and Variance Heterogeneity

High-dimensional data subject to heavy-tailed phenomena and heterogeneity are commonly encountered in various scientific fields and pose new challenges to classical statistical methods. In this paper, we combine the asymmetric square…

Statistics Theory · Mathematics 2019-10-02 Jun Zhao, Guan'ao Yan, Yi Zhang

Adaptive Huber Regression

Big data can easily be contaminated by outliers or contain variables with heavy-tailed distributions, which makes many conventional methods inadequate. To address this challenge, we propose the adaptive Huber regression for robust…

Statistics Theory · Mathematics 2018-10-11 Qiang Sun, Wenxin Zhou, Jianqing Fan

Robust Gradient Descent Estimation for Tensor Models under Heavy-Tailed Distributions

Low-rank tensor models are widely used in statistics. However, most existing methods rely heavily on the assumption that data follows a sub-Gaussian distribution. To address the challenges associated with heavy-tailed distributions…

Methodology · Statistics 2025-09-16 Xiaoyu Zhang, Di Wang, Guodong Li, Defeng Sun

Semiparametric Expectile Regression for High-dimensional Heavy-tailed and Heterogeneous Data

Recently, high-dimensional heterogeneous data have attracted a lot of attention and discussion. Under heterogeneity, semiparametric regression is a popular choice to model data in statistics. In this paper, we take advantage of expectile…

Statistics Theory · Mathematics 2019-08-20 Jun Zhao, Guan'ao Yan, Yi Zhang

Kernel-Based Anomaly Detection Using Generalized Hyperbolic Processes

We present a novel approach to anomaly detection by integrating Generalized Hyperbolic (GH) processes into kernel-based methods. The GH distribution, known for its flexibility in modeling skewness, heavy tails, and kurtosis, helps to…

Machine Learning · Computer Science 2025-01-28 Pauline Bourigault, Danilo P. Mandic

Robust Inference for High-Dimensional Linear Models via Residual Randomization

We propose a residual randomization procedure designed for robust Lasso-based inference in the high-dimensional setting. Compared to earlier work that focuses on sub-Gaussian errors, the proposed procedure is designed to work robustly in…

Methodology · Statistics 2021-08-20 Y. Samuel Wang, Si Kai Lee, Panos Toulis, Mladen Kolar

User-Friendly Covariance Estimation for Heavy-Tailed Distributions

We offer a survey of recent results on covariance estimation for heavy-tailed distributions. By unifying ideas scattered in the literature, we propose user-friendly methods that facilitate practical implementation. Specifically, we…

Methodology · Statistics 2019-03-12 Yuan Ke, Stanislav Minsker, Zhao Ren, Qiang Sun, Wen-Xin Zhou

Robust Inference for High-dimensional Linear Models with Heavy-tailed Errors via Partial Gini Covariance

This paper introduces the partial Gini covariance, a novel dependence measure that addresses the challenges of high-dimensional inference with heavy-tailed errors, often encountered in fields like finance, insurance, climate, and biology.…

Methodology · Statistics 2024-11-21 Yilin Zhang, Songshan Yang, Yunan Wu, Lan Wang

Robust Filtering and Learning in State-Space Models: Skewness and Heavy Tails Via Asymmetric Laplace Distribution

State-space models are pivotal for dynamic system analysis but often struggle with outlier data that deviates from Gaussian distributions, frequently exhibiting skewness and heavy tails. This paper introduces a robust extension utilizing…

Signal Processing · Electrical Eng. & Systems 2025-07-31 Yifan Yu, Shengjie Xiu, Daniel P. Palomar

Robust and Sparse Generalized Linear Models for High-Dimensional Data via Maximum Mean Discrepancy

High-dimensional datasets are frequently subject to contamination by outliers and heavy-tailed noise, which can severely bias standard regularized estimators like the Lasso. While Maximum Mean Discrepancy (MMD) has recently been introduced…

Methodology · Statistics 2026-02-25 Xiaoning Kang, Lulu Kang

Generalized Rank Regression

Rank regression offers robustness to outliers and heavy-tailed response distributions, invariance to monotonic transformations, and improved efficiency under non-Gaussian errors, making it a versatile tool for analyzing complex data. This…

Methodology · Statistics 2026-05-25 Jiyuan Tu, Suqi Wu, Yichen Zhang, Wen-Xin Zhou

Log-Regularly Varying Scale Mixture of Normals for Robust Regression

Linear regression with the classical normality assumption for the error distribution may lead to undesirable posterior inference for regression coefficients due to potential outliers. This paper considers the finite mixture of two…

Methodology · Statistics 2021-01-12 Yasuyuki Hamura, Kaoru Irie, Shonosuke Sugasawa

Robust Tensor Regression with Nonconvexity: Algorithmic and Statistical Theory

Tensor regression is an important tool for tensor data analysis, but existing works have not considered the impact of outliers, making them potentially sensitive to such data points. This paper proposes a low tubal rank robust regression…

Methodology · Statistics 2026-05-11 Zihao Song, Jicai Liu, Heng Lian, Weihua Zhao

Distribution-dependent Generalization Bounds for Tuning Linear Regression Across Tasks

Modern regression problems often involve high-dimensional data, and careful tuning of the regularization hyperparameters is crucial to avoid overly complex models that may overfit the training data while guaranteeing desirable properties…

Machine Learning · Computer Science 2026-04-08 Maria-Florina Balcan, Saumya Goyal, Dravyansh Sharma

Emergence of heavy tails in homogenized stochastic gradient descent

It has repeatedly been observed that loss minimization by stochastic gradient descent (SGD) leads to heavy-tailed distributions of neural network parameters. Here, we analyze a continuous diffusion approximation of SGD, called homogenized…

Machine Learning · Statistics 2024-02-05 Zhe Jiao, Martin Keller-Ressel

Robust Linear Regression for General Feature Distribution

We investigate robust linear regression where data may be contaminated by an oblivious adversary, i.e., an adversary that may know the data distribution but is otherwise oblivious to the realizations of the data samples. This model has been…

Machine Learning · Computer Science 2022-02-07 Tom Norman, Nir Weinberger, Kfir Y. Levy

High-Dimensional Data Analysis for Elliptically Symmetric Distributions

High-dimensional data arise routinely in modern statistics, econometrics, finance, genomics, and machine learning. While a large body of existing methodology is developed under Gaussian or light-tailed assumptions, many real data sets…

Methodology · Statistics 2026-04-16 Long Feng

Classification of Heavy-tailed Features in High Dimensions: a Superstatistical Approach

We characterise the learning of a mixture of two clouds of data points with generic centroids via empirical risk minimisation in the high dimensional regime, under the assumptions of generic convex loss and convex regularisation. Each cloud…

Machine Learning · Statistics 2024-03-19 Urte Adomaityte, Gabriele Sicuro, Pierpaolo Vivo

Benign Overfitting of Constant-Stepsize SGD for Linear Regression

There is an increasing realization that algorithmic inductive biases are central in preventing overfitting; empirically, we often see a benign overfitting phenomenon in overparameterized settings for natural learning algorithms, such as…

Machine Learning · Computer Science 2021-10-14 Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Sham M. Kakade

Heavy Lasso: sparse penalized regression under heavy-tailed noise via data-augmented soft-thresholding

High-dimensional linear regression is a fundamental tool in modern statistics, particularly when the number of predictors exceeds the sample size. The classical Lasso, which relies on the squared loss, performs well under Gaussian noise…

Methodology · Statistics 2025-06-10 The Tien Mai