Related papers: Efficient Algorithms for Multidimensional Segmente…
We study the fixed design segmented regression problem: Given noisy samples from a piecewise linear function $f$, we want to recover $f$ up to a desired accuracy in mean-squared error. Previous rigorous approaches for this problem rely on…
Federated learning has become a popular tool in the big data era nowadays. It trains a centralized model based on data from different clients while keeping data decentralized. In this paper, we propose a federated sparse sliced inverse…
We provide a novel -- and to the best of our knowledge, the first -- algorithm for high dimensional sparse regression with constant fraction of corruptions in explanatory and/or response variables. Our algorithm recovers the true sparse…
Many conventional statistical procedures are extremely sensitive to seemingly minor deviations from modeling assumptions. This problem is exacerbated in modern high-dimensional settings, where the problem dimension can grow with and…
We study the task of high-dimensional entangled mean estimation in the subset-of-signals model. Specifically, given $N$ independent random points $x_1,\ldots,x_N$ in $\mathbb{R}^D$ and a parameter $\alpha \in (0, 1)$ such that each $x_i$ is…
The lasso has become an important practical tool for high dimensional regression as well as the object of intense theoretical investigation. But despite the availability of efficient algorithms, the lasso remains computationally demanding…
The availability of precise and accurate simulation is a limiting factor for interpreting and forecasting data in many fields of science and engineering. Often, one or more distinct simulation software applications are developed, each with…
A general stochastic algorithm for solving mixed linear and nonlinear problems was introduced in [11]. We show in this paper how it can be used to solve the fault inverse problem, where a planar fault in elastic half-space and a slip on…
We propose a novel, efficient approach for distributed sparse learning in high-dimensions, where observations are randomly partitioned across machines. Computationally, at each round our method only requires the master machine to solve a…
We derive an exact and efficient Bayesian regression algorithm for piecewise constant functions of unknown segment number, boundary location, and levels. It works for any noise and segment level prior, e.g. Cauchy which can handle outliers.…
This paper studies distributed estimation and support recovery for high-dimensional linear regression model with heavy-tailed noise. To deal with heavy-tailed noise whose variance can be infinite, we adopt the quantile regression loss…
Subset selection in multiple linear regression aims to choose a subset of candidate explanatory variables that tradeoff fitting error (explanatory power) and model complexity (number of variables selected). We build mathematical programming…
Multidimensional scaling is an important dimension reduction tool in statistics and machine learning. Yet few theoretical results characterizing its statistical performance exist, not to mention any in high dimensions. By considering a…
The statistical problem of estimating the effective dimension-reduction (EDR) subspace in the multi-index regression model with deterministic design and additive noise is considered. A new procedure for recovering the directions of the EDR…
We study the following basic machine learning task: Given a fixed set of $d$-dimensional input points for a linear regression problem, we wish to predict a hidden response value for each of the points. We can only afford to attain the…
In practice functional data are sampled on a discrete set of observation points and often susceptible to noise. We consider in this paper the setting where such data are used as explanatory variables in a regression problem. If the primary…
The problem of learning structural equation models (SEMs) from data is a fundamental problem in causal inference. We develop a new algorithm --- which is computationally and statistically efficient and works in the high-dimensional regime…
We give an overview of recent developments in the problem of reconstructing a band-limited signal from non-uniform sampling from a numerical analysis view point. It is shown that the appropriate design of the finite-dimensional model plays…
We consider high dimensional sparse regression, and develop strategies able to deal with arbitrary -- possibly, severe or coordinated -- errors in the covariance matrix $X$. These may come from corrupted data, persistent experimental…
Statistical inverse learning aims at recovering an unknown function $f$ from randomly scattered and possibly noisy point evaluations of another function $g$, connected to $f$ via an ill-posed mathematical model. In this paper we blend…