Related papers: Efficient Algorithms for Multidimensional Segmente…

Fast Algorithms for Segmented Regression

We study the fixed design segmented regression problem: Given noisy samples from a piecewise linear function $f$, we want to recover $f$ up to a desired accuracy in mean-squared error. Previous rigorous approaches for this problem rely on…

Machine Learning · Computer Science 2016-07-15 Jayadev Acharya, Ilias Diakonikolas, Jerry Li, Ludwig Schmidt

Federated Sufficient Dimension Reduction Through High-Dimensional Sparse Sliced Inverse Regression

Federated learning has become a popular tool in the big data era nowadays. It trains a centralized model based on data from different clients while keeping data decentralized. In this paper, we propose a federated sparse sliced inverse…

Machine Learning · Statistics 2023-01-24 Wenquan Cui, Yue Zhao, Jianjun Xu, Haoyang Cheng

High Dimensional Robust Sparse Regression

We provide a novel -- and to the best of our knowledge, the first -- algorithm for high dimensional sparse regression with constant fraction of corruptions in explanatory and/or response variables. Our algorithm recovers the true sparse…

Machine Learning · Computer Science 2019-05-31 Liu Liu, Yanyao Shen, Tianyang Li, Constantine Caramanis

Computationally Efficient Robust Estimation of Sparse Functionals

Many conventional statistical procedures are extremely sensitive to seemingly minor deviations from modeling assumptions. This problem is exacerbated in modern high-dimensional settings, where the problem dimension can grow with and…

Machine Learning · Statistics 2017-02-27 Simon S. Du, Sivaraman Balakrishnan, Aarti Singh

Entangled Mean Estimation in High-Dimensions

We study the task of high-dimensional entangled mean estimation in the subset-of-signals model. Specifically, given $N$ independent random points $x_1,\ldots,x_N$ in $\mathbb{R}^D$ and a parameter $\alpha \in (0, 1)$ such that each $x_i$ is…

Data Structures and Algorithms · Computer Science 2025-01-10 Ilias Diakonikolas, Daniel M. Kane, Sihan Liu, Thanasis Pittas

Revisiting Marginal Regression

The lasso has become an important practical tool for high dimensional regression as well as the object of intense theoretical investigation. But despite the availability of efficient algorithms, the lasso remains computationally demanding…

Statistics Theory · Mathematics 2009-11-23 Christopher Genovese, Jiashun Jin, Larry Wasserman

Fast Perfekt: Regression-based refinement of fast simulation

The availability of precise and accurate simulation is a limiting factor for interpreting and forecasting data in many fields of science and engineering. Often, one or more distinct simulation software applications are developed, each with…

High Energy Physics - Experiment · Physics 2025-02-19 Moritz Wolf, Lars O. Stietz, Patrick L. S. Connor, Peter Schleper, Samuel Bein

A stochastic algorithm for fault inverse problems in elastic half space with proof of convergence

A general stochastic algorithm for solving mixed linear and nonlinear problems was introduced in [11]. We show in this paper how it can be used to solve the fault inverse problem, where a planar fault in elastic half-space and a slip on…

Numerical Analysis · Mathematics 2021-03-19 Darko Volkov

Efficient Distributed Learning with Sparsity

We propose a novel, efficient approach for distributed sparse learning in high-dimensions, where observations are randomly partitioned across machines. Computationally, at each round our method only requires the master machine to solve a…

Machine Learning · Statistics 2016-05-26 Jialei Wang, Mladen Kolar, Nathan Srebro, Tong Zhang

Bayesian Regression of Piecewise Constant Functions

We derive an exact and efficient Bayesian regression algorithm for piecewise constant functions of unknown segment number, boundary location, and levels. It works for any noise and segment level prior, e.g. Cauchy which can handle outliers.…

Statistics Theory · Mathematics 2007-06-13 Marcus Hutter

Distributed High-dimensional Regression Under a Quantile Loss Function

This paper studies distributed estimation and support recovery for high-dimensional linear regression model with heavy-tailed noise. To deal with heavy-tailed noise whose variance can be infinite, we adopt the quantile regression loss…

Methodology · Statistics 2020-09-21 Xi Chen, Weidong Liu, Xiaojun Mao, Zhuoyi Yang

Subset Selection for Multiple Linear Regression via Optimization

Subset selection in multiple linear regression aims to choose a subset of candidate explanatory variables that tradeoff fitting error (explanatory power) and model complexity (number of variables selected). We build mathematical programming…

Machine Learning · Statistics 2020-09-04 Young Woong Park, Diego Klabjan

Modified Multidimensional Scaling and High Dimensional Clustering

Multidimensional scaling is an important dimension reduction tool in statistics and machine learning. Yet few theoretical results characterizing its statistical performance exist, not to mention any in high dimensions. By considering a…

Methodology · Statistics 2022-03-30 Xiucai Ding, Qiang Sun

A new algorithm for estimating the effective dimension-reduction subspace

The statistical problem of estimating the effective dimension-reduction (EDR) subspace in the multi-index regression model with deterministic design and additive noise is considered. A new procedure for recovering the directions of the EDR…

Statistics Theory · Mathematics 2007-06-13 Arnak Dalalyan, Anatoly Juditsky, Vladimir Spokoiny

Reverse iterative volume sampling for linear regression

We study the following basic machine learning task: Given a fixed set of $d$-dimensional input points for a linear regression problem, we wish to predict a hidden response value for each of the points. We can only afford to attain the…

Machine Learning · Computer Science 2018-06-07 Michał Dereziński, Manfred K. Warmuth

Prediction in functional regression with discretely observed and noisy covariates

In practice functional data are sampled on a discrete set of observation points and often susceptible to noise. We consider in this paper the setting where such data are used as explanatory variables in a regression problem. If the primary…

Methodology · Statistics 2021-12-14 Siegfried Hörmann, Fatima Jammoul

Learning linear structural equation models in polynomial time and sample complexity

The problem of learning structural equation models (SEMs) from data is a fundamental problem in causal inference. We develop a new algorithm --- which is computationally and statistically efficient and works in the high-dimensional regime…

Machine Learning · Computer Science 2019-01-30 Asish Ghoshal, Jean Honorio

Numerical Analysis of the Non-uniform Sampling Problem

We give an overview of recent developments in the problem of reconstructing a band-limited signal from non-uniform sampling from a numerical analysis view point. It is shown that the appropriate design of the finite-dimensional model plays…

Numerical Analysis · Mathematics 2025-10-20 Thomas Strohmer

Robust High Dimensional Sparse Regression and Matching Pursuit

We consider high dimensional sparse regression, and develop strategies able to deal with arbitrary -- possibly, severe or coordinated -- errors in the covariance matrix $X$. These may come from corrupted data, persistent experimental…

Machine Learning · Statistics 2013-01-15 Yudong Chen, Constantine Caramanis, Shie Mannor

Least squares approximations in linear statistical inverse learning problems

Statistical inverse learning aims at recovering an unknown function $f$ from randomly scattered and possibly noisy point evaluations of another function $g$, connected to $f$ via an ill-posed mathematical model. In this paper we blend…

Statistics Theory · Mathematics 2024-01-22 Tapio Helin